Work place: Obafemi Awolowo University, Ile-Ife, Nigeria
E-mail: sobusola@oauife.edu.ng
Research Interests: Natural Language Processing, Programming Language Theory
Biography
Asahiah F. O. was born in 1972 and earned his B.Sc., M.Sc. and PhD from Obafemi Awolowo University, Ile-Ife, Nigeria, in 1997, 2005 and 2014 respectively. He joined the Department of Computer Science and Engineering of Obafemi Awolowo University, Ile-Ife, as a Graduate Assistant and has since risen to the position of Senior Lecturer. His publications include "The development of a syllabicator for Yorùbá language" (Proceedings of OAUTekConf, 2010, Nigeria), "Restoring Tone-Marks in Standard Yorùbá Electronic Text: Improved Model" (Computer Science, 18(3), Poland), "Survey of Diacritic Restoration in Abjad and Alphabet Writing Systems" (Journal of Natural Language Engineering, 24(1), 2018, UK) and "Computational Modelling of an Optical Character Recognition System for Yorùbá Printed Text Images" (Scientific Africa, Vol 9, 2020). His research interest is in human language processing, especially developing resources for low-resourced languages and applying machine learning to text processing.
By Asahiah Franklin Oladiipo, Onifade Mary Taiwo, Adegunlehin Abayomi Emmanuel
DOI: https://doi.org/10.5815/ijieeb.2020.06.03, Pub. Date: 8 Dec. 2020
While writing in most of the world's major languages has a long history, Yorùbá is a relatively young language as far as writing is concerned. It is therefore under-resourced in tools for processing it in digital form, and spell checking is one such tool. An analysis of spelling error patterns is fundamental to producing a good spell checker. We address this challenge in this article, and our findings show that spelling error patterns in Yorùbá broadly follow those of other languages, but with clear departures from the norm in the specifics. Diacritic-related misspellings accounted for more than 80% of all errors, and the share of words with a single-edit error fell below the generally expected minimum of 80%. In addition, most errors were vowel-related, with consonants accounting for less than 15% of all errors. Word length does not seem to have any direct bearing on the number of errors in a word. The research showed that diacritics have a greater impact on spelling errors in Yorùbá, where they are mainly used for tone marking and account for more than 80% of spelling errors, than in languages such as Brazilian Portuguese and Spanish, where they are used to differentiate characters and account for less than 60% of all errors. We thus conclude that while the character set used in a language determines, to a significant extent, the distribution of spelling errors, the purpose for which diacritics are employed in the language also affects that distribution.
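The two measurements the abstract relies on, whether a misspelling is diacritic-related and how many edits separate it from the intended word, can be illustrated with a minimal sketch. This is not the authors' method: the function names are hypothetical, every Unicode combining mark (tone marks and under-dots alike) is treated as a diacritic, and edits are counted over NFD code points so that a missing tone mark counts as one edit. All three choices are assumptions made for illustration.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Remove all combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over NFD code points.

    Assumption: each combining mark is its own symbol, so a dropped
    tone mark is a single deletion.
    """
    a = unicodedata.normalize("NFD", a)
    b = unicodedata.normalize("NFD", b)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def is_diacritic_only(misspelling: str, intended: str) -> bool:
    """True when the two words differ, but only in their diacritics."""
    same_form = (unicodedata.normalize("NFC", misspelling)
                 == unicodedata.normalize("NFC", intended))
    return (not same_form
            and strip_diacritics(misspelling) == strip_diacritics(intended))
```

Under these assumptions, "Yoruba" typed for "Yorùbá" is a diacritic-only error at distance 2 (two missing tone marks), so it would count as diacritic-related but not as a single-edit error.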