Abstract

Cross-lingual embeddings facilitate cross-language learning, bridging the gap between high-resource and low-resource languages. This study presents and evaluates unsupervised methods for generating cross-lingual embeddings for the low-resource Manipuri–English language pair. Manipuri is a resource-poor language spoken in the northeastern region of India. The embeddings are evaluated on the bilingual dictionary induction task for this language pair. Furthermore, we propose a method to improve the cross-lingual embeddings by exploiting a temporally aligned comparable corpus. Lack of supervision has always been an issue for learning models, especially in low-resource settings. The proposed method takes advantage of the temporal alignments to provide the much-needed supervision and improve the alignment for the Manipuri–English language pair. Across our experiments, the proposed model consistently outperforms all the corresponding baselines.

