Abstract

Sentiment analysis is an active research area in natural language processing. The task aims at identifying, extracting, and classifying sentiments from user texts in post blogs, product reviews, or social networks. In this paper, the ensemble learning model of sentiment classification is presented, also known as CEM (classifier ensemble model). The model contains various data feature types, including language features, sentiment shifting, and statistical techniques. A deep learning model is adopted with word embedding representation to address explicit, implicit, and abstract sentiment factors in textual data. The experiments conducted based on different real datasets found that our sentiment classification system is better than traditional machine learning techniques, such as Support Vector Machines and other ensemble learning systems, as well as the deep learning model, Long Short-Term Memory network, which has shown state-of-the-art results for sentiment analysis in almost corpuses. Our model’s distinguishing point consists in its effective application to different languages and different domains.

Highlights

  • Sentiment classification or opinion mining is the narrow field of natural language processing, information querying, and text mining that is used to extract a person’s impressions of or thoughts about something from nonstructural text data

  • We propose an effective ensemble learning system using datasets of base learners, which comprise features that result from exploring language characteristics and applying a deep learning model

  • Ensemble learning of training sets containing ‘deep feature’ and features relevant to polarity shifting of ‘surface feature’ type leads to classification results better than the deep learning state-of-the-art model in sentiment classification (LSTM)

Read more

Summary

Introduction

Sentiment classification or opinion mining is the narrow field of natural language processing, information querying, and text mining that is used to extract a person’s impressions of or thoughts about something from nonstructural text data. Polarity shifting can make traditional approaches, such as machine learning with the Bag-of-Words (BoW) model, ineffective because these approaches are interested in whether single words bear positive or negative polarity, based on a predetermined sentiment dataset. This paper adopts ensemble learning to classify sentiment at the document level inspired by [5,32]. We extract various different features from datasets for base learners by identifying the various structures that cause polarity shifting in the text; we call these ‘surface features’. We propose an effective ensemble learning system using datasets of base learners, which comprise features that result from exploring language characteristics and applying a deep learning model. We adopt word embedding and develop a deep learning model for base learners that helps improve the system’s effectiveness. The remainder of the paper is organized as follows: Section 2 presents related existing work, Section 3 presents the proposal of our model, Section 4 describes the experiments and valuations, and Section 5 presents the conclusion and introduces directions for future research

Related Work
Extracting ‘Surface Feature’
Extracting
Base learners and Meta-Learner
Ensemble
Vietnamese Language
English
Evaluation
Conclusions and Future Research
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call