Abstract

Numerous domain adaptation methods have been proposed over the last decade, and the most widely used ones owe their popularity to their generality across tasks and languages. General-purpose methods, however, do not account for language-specific issues, while sentiment-specific adaptation methods rely on high-quality language-specific resources such as tagging tools or sentiment lexicons. This study proposes a resource-free, unsupervised self-labeling adaptation framework for Arabic sentiment classification. The framework leverages the sentiment-specific task of lexicon induction, combining feature selection methods with an improved hybrid pairwise word similarity technique, and proved to be less sensitive to Arabic feature sparsity. A total of 12 traditional and 12 transformer-based experiments on two Arabic multi-domain datasets, adapted within the proposed framework, demonstrated that a simple yet effective unsupervised self-labeling approach outperformed complex representation-learning adaptation approaches for the Arabic language. The proposed framework improved on the best-performing method by 2% on a dataset of reviews and achieved competitive results on a dataset of tweets.
