Abstract

The objective of this work is to develop a script for evaluating how well online translators translate text from one language to another. For this purpose, we used Google Translate and Yandex.Translate. The analysis covered 147 news items (about 1800 sentences) in English, Kazakh and Russian. The texts were taken from the Internet resource astana.gov.kz, and a corpus of parallel texts in the three languages was created. The analysis was developed for the “sentence” pattern, with the prospect of further development for the “text” pattern. We analyzed errors in the following categories: untranslated/omitted words, extra words, incorrect word endings, incorrect word order, punctuation errors, mutilated translation and incorrect translation. Based on the analysis of the obtained data, we conclude that Yandex.Translate translates Russian text into Kazakh or English better than Google Translate. The developed comparison script and error analysis script are openly available on the Internet.

Highlights

  • In the list of languages by number of native speakers, English ranks 3rd (379 million people), Russian 7th (154 million people), and Kazakh 76th (12.9 million people) [1]

  • The analysis covered 147 news items (about 1800 sentences) in English, Kazakh and Russian

  • We used Google Translate [4] and Yandex.Translate [5], with examples in English, Kazakh and Russian


Summary

INTRODUCTION

In the list of languages by number of native speakers, English ranks 3rd (379 million people), Russian 7th (154 million people), and Kazakh 76th (12.9 million people) [1]. The objective of this work is to develop a script for evaluating how well online translators translate text from one language to another. For this purpose, we used Google Translate [4] and Yandex.Translate [5], with examples in English, Kazakh and Russian. Translation analysis was performed with the similar_text function [11] and token_set_ratio from the FuzzyWuzzy library [12] for the “sentence” pattern. The research contribution is a script that evaluates translation quality using the similar_text/token_set_ratio coefficients for different language pairs. Using this script, which is described in this article, the system calculates the similar_text/token_set_ratio coefficients for each sentence and, based on these coefficients, decides whether to display the Google Translate or the Yandex.Translate translation. The user thus gets the most reliable translation, compiled with the help of two of the most popular online translators in Kazakhstan.
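The per-sentence selection described above can be illustrated with a minimal Python sketch. It is not the authors' script. It assumes each machine translation is scored against a reference translation from the parallel corpus, and that the higher-scoring candidate is displayed. The function names `char_score`, `token_set_score` and `pick_translation` are hypothetical, and both scorers are standard-library stand-ins: `char_score` follows the same longest-common-substring idea as similar_text, and `token_set_score` is a simplified approximation of token_set_ratio.

```python
from difflib import SequenceMatcher


def char_score(a: str, b: str) -> float:
    """Character-level similarity in percent (stand-in for similar_text)."""
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def token_set_score(a: str, b: str) -> float:
    """Order-insensitive word similarity (simplified token_set_ratio stand-in)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    # Build three strings: shared words, and shared words plus each side's extras.
    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))
    with_a = (common + " " + only_a).strip()
    with_b = (common + " " + only_b).strip()
    pairs = [(common, with_a), (common, with_b), (with_a, with_b)]
    return max(char_score(x, y) for x, y in pairs)


def pick_translation(reference: str, candidates: dict, scorer=token_set_score):
    """Score every candidate against the reference; return (best name, all scores)."""
    scores = {name: scorer(reference, text) for name, text in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Illustrative sentences only, not taken from the corpus.
    reference = "The akim opened a new school"
    candidates = {
        "google": "The akim opened a new school",
        "yandex": "The mayor closed an old school",
    }
    print(pick_translation(reference, candidates))
```

With FuzzyWuzzy installed, `fuzz.token_set_ratio` could be passed as `scorer` in place of the stand-in. Note that a token-set score reaches 100 whenever one sentence's words are a subset of the other's, so it can be worth checking the character-level score as well.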

