A novel self-supervised sentiment classification approach using semantic labeling based on contextual embeddings

Mousa Alizadeh,Azam Seilsepour

doi:10.1007/s11042-024-19086-y

Abstract

AbstractSentiment Analysis (SA) is a domain or context-oriented task since the sentiment words convey different sentiments in various domains. As a result, the domain-independent lexicons cannot correctly recognize the sentiment of domain-dependent words. To address this problem, this paper proposes a novel self-supervised SA method based on semantic similarity, contextual embedding, and Deep Learning Techniques. It introduces a new Pseudo-label generator that estimates the pseudo-labels of samples using semantic similarity between the samples and their sentiment words. It proposes two new concepts to calculate semantic similarity: The Soft-Cosine Similarity of a sample with its Positive words (SCSP) and the Soft-Cosine Similarity of a document with its Negative words (SCSN). Then, the Pseudo-label generator uses these concepts and the number of sentiment words to estimate the label of each sample. Later on, a novel method is proposed to find the samples with highly accurate pseudo-labels. Finally, a hybrid classifier, composed of a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU), is trained using these highly accurate pseudo-labeled data to predict the label of unseen data. The comparison of the proposed method with the lexicons and other similar existing methods demonstrates that the proposed method outperforms them in terms of accuracy, precision, recall, and F1 score.

Full Text