Abstract

Embeddings are very popular representations that allow computing semantic and syntactic similarities between linguistic units from text co-occurrence matrices. Units can vary from character n-grams to words, including more coarse-grained units such as sentences and documents. Recently, multi-level embeddings combining representations from different units have been proposed as an alternative to single-level embeddings to account for the internal structure of words (i.e., morphology) and to help systems generalise well over out-of-vocabulary words. These representations, either pre-trained or learned, have been shown to be quite effective, outperforming word-level baselines in several NLP tasks such as machine translation, part-of-speech tagging and named entity recognition. Our aim here is to contribute to this line of research by proposing, for the first time in Arabic NLP, an in-depth study of the impact of various subword configurations, ranging from characters to character n-grams (including words), for social media text classification. We propose several neural architectures to learn character, subword and word embeddings, as well as a combination of these three levels, exploring different composition functions to obtain the final representation of a given text. To evaluate the effectiveness of these representations, we perform extrinsic evaluations on three text classification tasks (sentiment analysis, emotion detection and irony detection) while accounting for different Arabic varieties (Modern Standard Arabic and the Levantine and Maghrebi dialects). For each task, we experiment with well-known dialect-agnostic and dialect-specific datasets, including those recently used in shared tasks, to better compare our results with those reported in previous studies on the same datasets. The results show that the multi-level embeddings we propose outperform current static and contextualised embeddings as well as the best-performing state-of-the-art models in sentiment and emotion detection. In addition, we achieve competitive results in irony detection. Our models are also the most robust across dialects, and we observe that different dialects require different composition configurations. We finally show that these performances tend to increase further when coupling the multi-level representations with task-specific features.
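To make the idea of combining representation levels concrete, the following is a minimal sketch (not the authors' code) of how character, subword (character n-gram) and word embeddings for a token could be composed into a single multi-level vector via two simple composition functions, summation and concatenation. All names (EMB_DIM, lookup, char_ngrams, multi_level_embedding) are hypothetical placeholders for whatever lookup tables and composition the trained model actually provides.

```python
# Hypothetical sketch of multi-level embedding composition.
import numpy as np

EMB_DIM = 8  # toy dimensionality
rng = np.random.default_rng(0)
char_table, ngram_table, word_table = {}, {}, {}

def lookup(table, key):
    """Return (and lazily create) a toy embedding vector for `key`."""
    if key not in table:
        table[key] = rng.normal(size=EMB_DIM)
    return table[key]

def char_ngrams(word, n=3):
    """Character n-grams with boundary markers, e.g. '<ca', 'cat', 'at>'."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def multi_level_embedding(word, compose="sum"):
    """Combine character-, subword- and word-level vectors for one token."""
    char_vec = np.mean([lookup(char_table, c) for c in word], axis=0)
    ngram_vec = np.mean([lookup(ngram_table, g) for g in char_ngrams(word)], axis=0)
    word_vec = lookup(word_table, word)
    levels = [char_vec, ngram_vec, word_vec]
    if compose == "sum":               # one possible composition function
        return np.sum(levels, axis=0)
    return np.concatenate(levels)      # another: concatenation

print(multi_level_embedding("cat").shape)             # (8,)  summed levels
print(multi_level_embedding("cat", "concat").shape)   # (24,) concatenated levels
```

In practice the three lookup tables would be learned jointly (or pre-trained), and the choice of composition function is exactly what the study varies across tasks and dialects.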
