Abstract

Research on regional languages is difficult to conduct because little or no corpus material is available for them, and this is particularly true of Indonesia's regional languages. This project seeks to build a machine translation system that translates from Indonesian into the Meher and Woirata languages and vice versa. To achieve this, a corpus of the Meher and Woirata languages must first be developed. The corpus was produced through field studies: the researcher asked several speakers of each language to translate manually, then compared the results from the different translators in focus group discussions to identify the appropriate use of words. The outcomes of this translation process were recorded as a database of Indonesian-Meher and Indonesian-Woirata language pairs, which serves as the training data for the machine translation system. The research collected 714,000 words in the Meher language and 805,000 words in the Woirata language. These results were then used as the training corpus for machine translation, and the system's output was validated through direct assessment by speakers of the two languages. The testing indicated an accuracy above 80% both for translation into the Meher language and for translation into the Woirata language. It can be concluded that the field research succeeded in gathering and establishing a language corpus for each of these two languages. In addition, the experimental results suggest that translation algorithms can convert Indonesian into regional languages and vice versa with acceptable accuracy.
The contribution of this research is the establishment of the Meher and Woirata language corpora so that they can be accessed by anyone who needs them.
