The development of the disambiguation component of a Yorùbá-to-English machine translation system is hindered by several factors, one of which is the lack of a machine-readable sense inventory for ambiguous Yorùbá words. This study addressed the problem by developing a machine-readable dictionary of ambiguous Yorùbá verbs. To achieve this, ambiguous Yorùbá verbs and their translations were collected from existing bilingual dictionaries. The collected lexical entries were then encoded in a machine-readable format using the Extensible Markup Language (XML). The translation accuracy of the machine-readable dictionary was evaluated using the mean opinion score, and a score of 4.37 on a scale of 5 was obtained. The study covered ninety-three (93) monosyllabic verbs with two hundred and forty-one (241) senses, which gives a coverage of 69.5% of the ambiguous monosyllabic verbs in the Yorùbá language. The sense inventory was also used as a component of a Yorùbá Word Sense Disambiguation system, which achieved an accuracy of 94.6%. The study concludes that the digitized data can increase the accuracy of the Word Sense Disambiguation component of a Yorùbá-to-English machine translation system.