Abstract

As an important issue in sentiment analysis, sentence-level polarity classification plays a critical role in many opinion-mining applications such as opinion question answering, opinion retrieval and opinion summarization. Employing a supervised learning paradigm to train a classifier from sentences often faces the data sparseness problem owing to the short-length limit introduced to texts. In this article, regarding this problem, we exploit two different feature sets learned from external data sets as additional features to enrich data representation: one is a latent topic feature set obtained using a topic model, and the other is a related word feature set derived using word embeddings. Furthermore, we propose an ensemble approach by using these additional features to guide the design of different members of the ensemble. Experimental results on the public movie review dataset demonstrate that the enriched representations are effective for improving the performance of polarity classification, and the proposed ensemble approach can further improve the overall performance.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call