Abstract

Sentiment analysis, an active research topic, presents new challenges for understanding users' opinions and judgments expressed online. It aims to classify subjective texts by assigning them a polarity label. In this paper, we introduce a novel machine learning framework based on auto-encoder networks to predict the sentiment polarity label at both the word level and the sentence level. Inspired by the dimensionality-reduction and feature-extraction capabilities of auto-encoders, we propose a new model for distributed word vector representation, "PMI-SA", which takes pointwise mutual information ("PMI") word vectors as input. The resulting continuous word vectors are combined to represent a sentence. We also develop an unsupervised sentence embedding method, called Contextual Recursive Auto-Encoders ("CoRAE"), for learning sentence representations. CoRAE follows the basic idea of recursive auto-encoders, deeply composing the vectors of the words constituting a sentence, but without relying on any syntactic parse tree. Instead, the CoRAE model recursively combines each word with its context words (the previous and next neighbors) while respecting word order. A support vector machine classifier with fine-tuning is then used to show that our deep compositional representation model, CoRAE, significantly improves the accuracy of the sentiment analysis task. Experimental results demonstrate that CoRAE remarkably outperforms several competitive baseline methods on two datasets, namely the Sanders Twitter corpus and a Facebook comments corpus. The CoRAE model achieves an accuracy of 83.28% on the Facebook dataset and 97.57% on the Sanders dataset.
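To make the contextual composition idea concrete, the following NumPy sketch illustrates one plausible reading of a CoRAE-style step: each word vector is merged with the representation built so far (its left context) and the next word (its right context) through a small auto-encoder, scored by reconstruction error. All names (encode, compose_sentence, W_enc, W_dec), the greedy left-to-right merge order, and the zero padding at sentence borders are illustrative assumptions, not the paper's exact formulation.

    # Minimal NumPy sketch of a CoRAE-style composition step (illustrative only).
    # Assumption: each word is a d-dimensional vector; a single-layer auto-encoder
    # compresses [prev_context; word; next_context] (3d) down to d and is scored
    # by reconstruction error.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 8                                              # toy word-vector dimensionality
    W_enc = rng.normal(scale=0.1, size=(d, 3 * d))     # encoder weights (hypothetical)
    b_enc = np.zeros(d)
    W_dec = rng.normal(scale=0.1, size=(3 * d, d))     # decoder weights (hypothetical)
    b_dec = np.zeros(3 * d)

    def encode(prev_vec, word_vec, next_vec):
        """Compose a word with its previous and next context vectors."""
        x = np.concatenate([prev_vec, word_vec, next_vec])
        hidden = np.tanh(W_enc @ x + b_enc)            # compressed representation
        recon = np.tanh(W_dec @ hidden + b_dec)        # reconstruction of the input
        error = float(np.sum((recon - x) ** 2))        # reconstruction error
        return hidden, error

    def compose_sentence(word_vectors):
        """Greedy left-to-right composition of a sentence, preserving word order."""
        zero = np.zeros(d)                             # padding at sentence borders
        sentence_vec = zero
        total_error = 0.0
        for i, w in enumerate(word_vectors):
            prev_vec = sentence_vec                    # context composed so far
            next_vec = word_vectors[i + 1] if i + 1 < len(word_vectors) else zero
            sentence_vec, err = encode(prev_vec, w, next_vec)
            total_error += err
        return sentence_vec, total_error

    words = [rng.normal(size=d) for _ in range(5)]     # stand-ins for PMI-SA vectors
    vec, err = compose_sentence(words)
    print(vec.shape, round(err, 3))                    # (8,) plus total reconstruction error

In the full framework, the auto-encoder weights would be trained to minimize the accumulated reconstruction error, and the resulting sentence vectors would feed the fine-tuned SVM classifier mentioned above; this sketch only shows the forward composition pass.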
