Abstract
In this paper, we describe an effort towards the development of parallel corpora for neural machine translation between English and Ethiopian languages such as Wolaita, Gamo, Gofa, and Dawuro. The corpora are collected from the religious domain, and to check their usability we conducted bi-directional neural machine translation experiments. The baseline models show promising results, with BLEU scores of 13.8 for Wolaita-English and 8.2 for English-Wolaita translation. Wolaita-English translation performs better than the other Ethiopian language pairs, and translation quality improves as the amount of data increases, indicating that dataset size has a great impact on performance. In addition, the morphological richness of the Ethiopian languages contributes to the lower performance of neural machine translation when an Ethiopian language is the target language. We are currently working on minimizing the effect of morphological richness through different morphological processing techniques in the translation of Ethiopian languages.