<p><span lang="EN-US">This research investigates challenges and advancements in neural machine translation (NMT), specifically for English-to-Kannada translation. Emphasizing the data scarcity and linguistic complexity of low-resource languages (LRLs), particularly Kannada, the study underscores the need for specialized techniques. Starting with an exploration of Kannada's historical and cultural significance, the paper highlights the critical importance of linguistic comprehension. The primary objective is to develop robust NMT models that produce precise and contextually relevant translations in low-resource scenarios. The novelty of this research lies in its approach to Kannada NMT challenges, which incorporates a comprehensive examination of historical and cultural context to establish a strong linguistic foundation. Motivated by the urgency of addressing translation needs in LRLs, the paper proposes novel strategies, notably advocating backtranslation to generate synthetic parallel corpora. Rigorous testing, including bilingual evaluation understudy (BLEU) score assessments, evaluates the effectiveness of the proposed approaches. Beyond assessing backtranslation, the study explores the challenges Kannada NMT faces in handling dialectal and spelling variations. The research reports a substantial 83-percentage-point average increase in BLEU scores, contingent on unique Kannada terms being aligned with the same domain as their existing occurrences. This study contributes significantly to Kannada natural language processing by offering novel insights into NMT intricacies and providing practical solutions for enhancing translation accuracy in low-resource settings.</span></p>