Abstract

Cross-lingual text classification is a challenging task that aims to train classifiers on data in one language, known as the source language, and apply the acquired knowledge to data in another language, referred to as the target language. Recent multilingual pre-trained language models (PLMs) have made significant progress on cross-lingual problems, and prompt-based learning has further improved task performance. However, these models still face challenges: the gap between cross-lingual classification tasks and the pre-training tasks of PLMs, together with scarce resources and data noise, hinders full exploitation of the implicit knowledge in PLMs. In this paper, we propose a Prompt-based Cross-lingual Learning (PCL) framework that combines language-agnostic continuous prompt learning with a self-training process. Specifically, the PCL framework leverages language-agnostic prompts and PLMs to achieve semantic transfer between the source and target languages. To strengthen the semantic relationship between prompts and category labels, a label attention module is introduced. In addition, a set of self-training rules built around a scoring function is proposed: in a few-shot setting, noisy data is dynamically filtered by scoring and ranking the data, and during each training iteration both the model weights and the scoring-function weights are updated, further improving the discrimination capability of the model. In summary, the proposed PCL framework builds on cross-lingual prompt learning, effectively removes noisy data, and applies to zero-shot cross-lingual text classification, which is beneficial for engineering applications. The findings of this study have implications for prompt-based learning methods. The PCL framework achieves state-of-the-art performance on cross-lingual text classification, with a 14% performance improvement over basic soft prompt learning. This demonstrates its potential for addressing classification problems in resource-limited scenarios.
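The abstract's self-training rules can be illustrated with a minimal sketch: score pseudo-labeled examples, rank them, keep the top fraction as clean data, and repeat over iterations. The names here (`score_fn`, `keep_ratio`, `iterations`) are illustrative assumptions, not the paper's actual API, and a real system would also update the model and scoring-function weights each round.

```python
# Hypothetical sketch of score-and-rank noise filtering, assuming each
# example carries a confidence score; not the paper's actual implementation.

def filter_noisy(examples, score_fn, keep_ratio=0.5):
    """Rank examples by score (descending) and keep the top fraction."""
    ranked = sorted(examples, key=score_fn, reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]

def self_training_filter(examples, score_fn, iterations=3, keep_ratio=0.5):
    """Iteratively shrink the pool of pseudo-labeled data.

    In the full framework, each iteration would also retrain the
    classifier and update the scoring-function weights; here the
    scoring function is held fixed for simplicity.
    """
    pool = list(examples)
    for _ in range(iterations):
        pool = filter_noisy(pool, score_fn, keep_ratio)
    return pool
```

For example, with `(text, confidence)` pairs and `score_fn=lambda x: x[1]`, each pass discards the lower-scored half of the pool, mimicking the dynamic removal of noisy pseudo-labels described above.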
