Abstract

Machine translation (MT) aims to remove linguistic barriers and enable communication by translating between languages automatically. The quality of translations produced by corpus-based MT systems depends on the availability of a substantial parallel corpus. This paper develops corpus-based bidirectional statistical machine translation (SMT) systems for the Punjabi-English, Punjabi-Hindi, and Hindi-English language pairs. The IIT Bombay Hindi-English parallel corpus is used to create a parallel corpus for English, Hindi, and Punjabi. The paper discusses the preprocessing steps used to create the Hindi, Punjabi, and English corpus, which is then used to develop the MT models. The accuracy of the MT systems is evaluated with an automatic metric, Bilingual Evaluation Understudy (BLEU). The reported BLEU scores are 17.79 and 19.78 for the Punjabi-English bidirectional MT system, 33.86 and 34.46 for the Punjabi-Hindi bidirectional MT system, and 23.68 and 23.78 for the Hindi-English bidirectional MT system.
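
For readers unfamiliar with BLEU, the following is a minimal sketch of how a corpus-level score on a 0-100 scale can be computed. It is not the evaluation tool used in the paper, which the abstract does not name. It assumes one reference per hypothesis, whitespace tokenization, n-grams up to order 4, and no smoothing. The function names `ngram_counts` and `corpus_bleu` are illustrative.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), assuming one reference per hypothesis.

    Both arguments are lists of tokenized sentences (lists of tokens).
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp, n)
            ref_counts = ngram_counts(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, an order with zero matches makes the score zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the modified n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty lowers the score when hypotheses are shorter than references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

A bidirectional system is scored once per direction, each time with the system output as `hypotheses` and the human translation as `references`, which is why the abstract reports two scores per language pair.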
