In recent years, transformer models have achieved great success in natural language processing (NLP) tasks. Most current results are obtained with monolingual transformer models, where the model is pre-trained on a single-language unlabelled text corpus and then fine-tuned for the specific downstream task. However, the cost of pre-training a new transformer model is prohibitive for most languages. In this work, we propose a cost-effective transfer learning method to adapt a strong source-language model, pre-trained on a large monolingual corpus, to a low-resource target language. Using the English XLNet language model, we demonstrate performance competitive with mBERT and a pre-trained target-language model on the cross-lingual sentiment (CLS) dataset and on a new sentiment analysis dataset for the low-resource language Tigrinya. With only 10k examples of the Tigrinya sentiment analysis dataset, English XLNet achieved a 78.88% F1-score, outperforming BERT and mBERT by 10% and 7%, respectively. More interestingly, fine-tuning the English XLNet model on the CLS dataset showed promising results compared to mBERT and even outperformed mBERT on one of the Japanese datasets.
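To illustrate the transfer setup described above, the following is a minimal sketch (not the authors' code) of fine-tuning an English XLNet checkpoint for target-language sentiment classification with the Hugging Face Transformers library; the dataset files, column names, number of labels, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: fine-tune English XLNet on a target-language sentiment dataset.
# Assumes hypothetical CSV files with "text" and "label" columns (e.g. Tigrinya reviews).
from datasets import load_dataset
from transformers import (
    XLNetTokenizerFast,
    XLNetForSequenceClassification,
    TrainingArguments,
    Trainer,
)

dataset = load_dataset(
    "csv", data_files={"train": "train.csv", "test": "test.csv"}
)

# Source-language (English) pre-trained model, reused for the target language.
tokenizer = XLNetTokenizerFast.from_pretrained("xlnet-base-cased")
model = XLNetForSequenceClassification.from_pretrained(
    "xlnet-base-cased", num_labels=2  # binary sentiment (assumed)
)

def tokenize(batch):
    # Tokenize target-language text with the English XLNet tokenizer.
    return tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=128
    )

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="xlnet-sentiment",
    num_train_epochs=3,            # illustrative values, not the paper's settings
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```

The same fine-tuning loop would apply to the CLS dataset or any other labelled target-language corpus; only the data files and the number of labels change.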