Abstract

Machine translation for low-resource languages can be improved using various techniques. One such technique is transfer learning, in which knowledge learned by training a model on a high-resource language pair is applied to a model for a low-resource language pair. This paper discusses experiments on neural machine translation using transfer learning for the English-Khasi language pair and the resulting improvements. Long short-term memory (LSTM) is used as the backbone architecture for the transfer learning model. The essential technique is a shared vocabulary, constructed from the byte pair encoding (BPE) subword units of the two language pairs and of the byte-pair-encoded datasets. Human subjective evaluation, statistical evaluation, and automatic evaluation of the experimental output all show positive results for the transfer learning system. Word order agreement is analyzed in detail, and the outputs of the baseline and the transfer learning system are compared. Together, the analysis and evaluation indicate that neural machine translation using transfer learning improves translation accuracy for Khasi, a low-resource language.
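
The shared-vocabulary idea can be illustrated with a minimal sketch. It assumes that "shared" means learning one set of BPE merges over the concatenated text of the high-resource (parent) and low-resource (child) corpora. The abstract does not give the actual procedure, so the function names `learn_bpe` and `shared_vocabulary`, the end-of-word marker, the tie-breaking rule, and the toy English placeholder corpora are all assumptions made for illustration, not the authors' implementation or data.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to num_merges BPE merges from a list of words (toy version)."""
    # Each word becomes a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken lexicographically (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the chosen pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab


def shared_vocabulary(parent_corpus, child_corpus, num_merges):
    """Learn one merge list over both corpora and return (merges, subword list)."""
    words = [w for line in parent_corpus + child_corpus for w in line.split()]
    merges, vocab = learn_bpe(words, num_merges)
    subwords = sorted({s for symbols in vocab for s in symbols})
    return merges, subwords


# Toy placeholder sentences only; these are not the corpora used in the paper.
parent = ["the lower tower", "a low road"]
child = ["lowest newer", "new low"]
merges, subwords = shared_vocabulary(parent, child, num_merges=10)
print(len(merges), len(subwords))
```

Because both models would segment text with the same merge list, subword units that occur in both corpora map to the same vocabulary entries, which is what allows embeddings trained on the parent pair to be reused for the child pair.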
