Abstract

This paper describes the design of an online hybrid machine translation (MT) system for Khmer, a low-resource language and the official language of Cambodia. The proposed system uses the open-source statistical machine translation (SMT) toolkit DoMY CE as its primary translation tool. The parallel corpora were prepared from various sources and the Headley Khmer-English dictionary. The language model, translation model, and decoder were configured using the DoMY CE toolkit. A post-processing step applies a part-of-speech tagger to improve the quality of the target-language sentences. Experimental results with English as the source language and Khmer as the target language demonstrate the effectiveness of the proposed scheme: in our experiments, the proposed model achieved good National Institute of Standards and Technology (NIST) and BiLingual Evaluation Understudy (BLEU) scores. The online translation system was developed using several web technologies.
