Abstract

This research aims to build a statistical machine translation system for Javanese-Indonesian translation and to determine how its two main data sources, the parallel corpus and the monolingual corpus, influence translation quality. Testing was carried out by gradually increasing the quantity of parallel and monolingual corpus data across seven configurations of the Javanese-Indonesian statistical machine translation system. All configurations were tested with the same test data of 500 Javanese sentences. The translation output was evaluated automatically using Bilingual Evaluation Understudy (BLEU). Across the seven configurations, the evaluation score rose as the quantity of parallel and monolingual corpus data increased. Adding parallel corpus data raised the BLEU score by 3.6 percentage points from configuration 1 to configuration 2, by 8.23 from configuration 2 to configuration 3, and by 14.92 from configuration 3 to configuration 7. Adding monolingual corpus data raised the BLEU score by 0.18 percentage points from configuration 4 to configuration 5, by 0.06 from configuration 5 to configuration 6, and by 0.24 from configuration 6 to configuration 7. The results show that both the parallel corpus and the monolingual corpus can increase the evaluation score of Javanese-Indonesian statistical machine translation, but the quantity of parallel corpus data has a greater influence than the quantity of monolingual corpus data.
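To make the evaluation metric concrete, the following is a minimal sketch of corpus-level BLEU (modified n-gram precision up to 4-grams with a brevity penalty). It is an illustration only: the abstract does not say which BLEU implementation the authors used, the function names `ngrams` and `corpus_bleu` are hypothetical, tokenization is plain whitespace splitting, and the example sentences are invented rather than taken from the paper's corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, no smoothing."""
    matches = [0] * max_n   # clipped n-gram matches per order
    totals = [0] * max_n    # hypothesis n-gram counts per order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)


# Invented Indonesian example sentences (not from the paper's test set).
reference = ["saya pergi ke pasar pagi ini"]
exact = corpus_bleu(reference, ["saya pergi ke pasar pagi ini"])
truncated = corpus_bleu(reference, ["saya pergi ke pasar"])
unrelated = corpus_bleu(reference, ["dia makan nasi goreng sekarang juga"])
```

An exact match scores 100, a hypothesis with no overlapping words scores 0, and the truncated hypothesis is reduced only by the brevity penalty. Production toolkits add tokenization rules and smoothing options, so their scores can differ from this sketch.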

Highlights

  • Humans are individual and social beings that interact with each other

  • All machine translation configurations were tested with the same test data of 500 Javanese sentences. The first phase of implementing Javanese-Indonesian statistical machine translation in this research began with the preprocessing stage

  • Testing with a gradually increasing quantity of parallel corpus data yielded a Bilingual Evaluation Understudy (BLEU) score of 39.79% for configuration 1, 43.39% for configuration 2, 51.62% for configuration 3, and 66.54% for configuration 7
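The gains between successive parallel-corpus configurations follow directly from the BLEU scores listed above. The short sketch below recomputes them; the scores are the ones reported in the highlights, while the names `bleu_by_config` and `bleu_gains` are hypothetical and used only for illustration.

```python
# BLEU scores (%) reported for the parallel-corpus configurations.
bleu_by_config = {1: 39.79, 2: 43.39, 3: 51.62, 7: 66.54}


def bleu_gains(scores):
    """Return the BLEU difference between each pair of consecutive configurations."""
    configs = list(scores)  # dicts preserve insertion order
    return {
        (a, b): round(scores[b] - scores[a], 2)  # round away float noise
        for a, b in zip(configs, configs[1:])
    }


gains = bleu_gains(bleu_by_config)
for (a, b), gain in gains.items():
    print(f"configuration {a} -> {b}: +{gain:.2f} BLEU points")
```

The differences are absolute changes in BLEU score (percentage points), not relative percentage increases.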


Summary

Introduction

Language is required to convey purposes and objectives to others, and conveying them can be difficult. Language is an important component of everyday life and serves as a means of interaction and communication between individuals. Indonesia has a very large number of regional languages because each province has several regional languages at once. Across the 34 provinces of Indonesia, 718 regional languages had been identified as of 2020. One of the regional languages with the most speakers is Javanese. Javanese ngoko is the variety of Javanese most commonly used by ethnic Javanese people. Because there are so many regional languages in Indonesia, not all Indonesians master them. Most Indonesians speak only the regional language used in their own region.

