Multi-Objective Model Selection (MOMS)-based Semi-Supervised Framework for Sentiment Analysis

Farhan Hassan Khan, Usman Qamar, Saba Bashir

doi:10.1007/s12559-016-9386-8

Abstract

Sentiment analysis has emerged as an active research field due to the rapid growth of user-generated content on the Internet. This research area analyzes the opinions and attitudes of the masses toward products, movies, topics, individuals, and services. Various machine learning and text mining algorithms have been used for sentiment analysis and classification. Recent research concludes that domain-specific lexicons perform significantly better than domain-independent lexicons. The proposed research aims to improve the performance of general-purpose lexicons using machine learning algorithms. A semi-supervised framework based on “MOMS” is introduced to determine feature weights by incorporating SentiWordNet, a well-known general-purpose sentiment lexicon. The feature weights are learned by a support vector machine, and the classification performance is enhanced by a Multi-Objective Model Selection procedure. A subjectivity criterion is used to select the desired features, and the effects of feature selection with respect to part-of-speech information are studied comprehensively. Experimental evaluation is performed on seven benchmark datasets, which include the Large movie review dataset, the Multi-domain sentiment dataset, and the Cornell movie review dataset. The proposed approach is compared with state-of-the-art techniques, lexicon-based approaches, and other methods for sentiment analysis. The proposed framework achieves high performance compared with other research in this field.
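As a rough illustration of the subjectivity-based feature selection mentioned in the abstract, the following minimal sketch filters part-of-speech-tagged words against a lexicon and assigns each surviving feature an initial polarity weight. It is not the authors' implementation: the lexicon `TOY_LEXICON`, its scores, the definition of subjectivity as the sum of positive and negative scores, the threshold, and the function names are all assumptions made for this example, and the SVM weight learning and model selection steps are omitted.

```python
from collections import Counter

# Hypothetical stand-in for SentiWordNet: (word, POS) -> (positive, negative).
# The scores are invented for illustration; they are not real lexicon entries.
TOY_LEXICON = {
    ("good", "a"): (0.75, 0.0),
    ("bad", "a"): (0.0, 0.625),
    ("plot", "n"): (0.0, 0.0),
    ("enjoy", "v"): (0.5, 0.0),
    ("boring", "a"): (0.0, 0.5),
}


def subjectivity(pos_score, neg_score):
    """Subjectivity taken as positive + negative score (an assumption)."""
    return pos_score + neg_score


def select_features(tagged_tokens, threshold=0.25, allowed_pos=("a", "v")):
    """Keep (word, POS) pairs that are in the lexicon, have an allowed POS,
    and meet the subjectivity threshold; weight them by polarity * count."""
    counts = Counter(tagged_tokens)
    features = {}
    for key, count in counts.items():
        if key not in TOY_LEXICON or key[1] not in allowed_pos:
            continue
        pos_score, neg_score = TOY_LEXICON[key]
        if subjectivity(pos_score, neg_score) >= threshold:
            # Initial weight: (positive - negative) scaled by term frequency.
            features[key] = (pos_score - neg_score) * count
    return features


# A tiny, already-tagged review: "good plot, boring ... good".
review = [("good", "a"), ("plot", "n"), ("boring", "a"), ("good", "a")]
features = select_features(review)
print(features)
```

Changing `allowed_pos` (for example, to adjectives only) is one simple way to examine how the part-of-speech restriction affects which features survive; in a fuller pipeline these initial weights would be inputs to a learned classifier rather than final values.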
