Abstract

Vector space word representations have gained big success recently at improving performance across various NLP tasks. However, existing word embeddings learning methods only utilize homo-lingual corpus. Inspired by transfer learning, we propose a novel language transfer method to obtain word embeddings via language transfer. Under this method, in order to obtain word embeddings of one language (target language), we train models on corpus of another different language (source language) instead. And then we use the obtained source language word embeddings to represent target language word embeddings. We evaluate the word embeddings obtained by the proposed method on word similarity tasks across several benchmark datasets. And the results show that our method is surprisingly effective, outperforming competitive baselines by a large margin. Another benefit of our method is that the process of collecting new corpus might be skipped.KeywordsTarget WordLanguage ModelTarget LanguageBenchmark DatasetTransfer LearningThese keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.