<p>With the rise of machine translation (MT) systems, it has become essential to evaluate the quality of the translations they produce. However, evaluation metrics designed for English and other European languages are not always suitable for Indic languages because of their complex morphology and syntax. Machine translation evaluation (MTE) is the process of assessing the quality and accuracy of machine-translated text, typically by comparing the machine output against a reference translation to measure similarity and correctness. This study compares several lexical automatic MT evaluation metrics, namely BLEU, METEOR, and TER, to identify the most suitable metric for Indic languages. For the analysis, five low-resource language pairs of parallel corpora were selected: English-Hindi, English-Punjabi, English-Gujarati, English-Marathi, and English-Bengali. All of these target languages belong to the Indo-Aryan language family and are resource-poor. A comparison of state-of-the-art MT systems is also presented, showing which translator performs better on these language pairs. Natural Language Toolkit (NLTK) tokenizers are used in the analysis of the experimental results, and the experiments are carried out on two different datasets for each language pair using fully automatic MT evaluation metrics. The study thereby provides insights into the effectiveness of these metrics in assessing the quality of machine translations for Indic languages, and the datasets and analysis are intended to facilitate future research in Indian MT evaluation.</p>
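<p>To make the evaluation setup concrete, the sketch below shows how the reported metrics can be computed for a single hypothesis-reference pair. It is a minimal illustration, not the authors' exact pipeline: BLEU and METEOR are taken from NLTK (whose tokenizers the abstract mentions), while TER is computed with the sacrebleu library, which is an assumption since the abstract does not name a TER implementation; the Hindi sentences are purely illustrative.</p>
<pre><code>
# Minimal sketch: sentence-level BLEU, METEOR, and TER for one
# hypothesis/reference pair. BLEU and METEOR come from NLTK; TER is
# computed with sacrebleu (an assumption -- the paper does not name
# a TER implementation). The Hindi sentences are illustrative only.
import nltk
from nltk.tokenize import word_tokenize
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from sacrebleu.metrics import TER

nltk.download("punkt", quiet=True)    # NLTK tokenizer models
nltk.download("wordnet", quiet=True)  # METEOR uses WordNet for stem/synonym matching

reference = "वह स्कूल जा रहा है"       # illustrative Hindi reference
hypothesis = "वह विद्यालय जा रहा है"   # illustrative MT output (synonym of "school")

ref_tokens = word_tokenize(reference)
hyp_tokens = word_tokenize(hypothesis)

# Sentence-level BLEU with smoothing, since short sentences otherwise score 0.
bleu = sentence_bleu([ref_tokens], hyp_tokens,
                     smoothing_function=SmoothingFunction().method1)

# METEOR (recent NLTK versions expect pre-tokenized references and hypothesis).
meteor = meteor_score([ref_tokens], hyp_tokens)

# TER counts edit operations per reference word, so lower is better.
ter = TER().corpus_score([hypothesis], [[reference]]).score

print(f"BLEU: {bleu:.3f}  METEOR: {meteor:.3f}  TER: {ter:.1f}")
</code></pre>
<p>In a full evaluation such scores would be aggregated at corpus level over each parallel test set rather than reported per sentence, but the per-sentence form above keeps the behaviour of the three metrics easy to inspect.</p>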