Abstract

Machine translation (MT) aims to remove linguistic barriers and enable communication by translating between languages automatically. The quality of translations produced by corpus-based MT systems depends on the availability of a substantial parallel corpus. This paper develops a corpus-based bidirectional statistical machine translation (SMT) system for the Punjabi-English, Punjabi-Hindi, and Hindi-English language pairs. The IIT Bombay Hindi-English parallel corpus is used as the basis for a parallel corpus covering English, Hindi, and Punjabi. The paper discusses the preprocessing steps used to create the Hindi, Punjabi, and English corpus, which is then used to develop the MT models. The accuracy of the MT system is evaluated using an automated metric, Bilingual Evaluation Understudy (BLEU). The reported BLEU scores are 17.79 and 19.78 for the Punjabi-English bidirectional MT system, 33.86 and 34.46 for the Punjabi-Hindi bidirectional MT system, and 23.68 and 23.78 for the Hindi-English bidirectional MT system.
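
For readers unfamiliar with BLEU, the following is a minimal sketch of how a corpus-level score on the 0-100 scale can be computed. It is not the evaluation tool used in this paper, which the abstract does not name. It assumes a single reference per sentence, whitespace tokenization, n-grams up to length 4 with uniform weights, and no smoothing; the function names `ngrams` and `corpus_bleu` are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), assuming one reference per hypothesis."""
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        # Assumption: whitespace tokenization; real tools tokenize more carefully.
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each hypothesis n-gram count
            # to its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions (uniform weights).
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)
```

Calling `corpus_bleu(system_outputs, reference_translations)` with two equal-length lists of sentences returns a single score for the whole test set. Because tokenization and smoothing choices affect the result, scores from this sketch are not directly comparable to the figures reported above.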
