Abstract

Sentiment analysis, or opinion mining, refers to the process of identifying and categorizing subjective information in source materials using natural language processing (NLP), text analytics, and statistical linguistics. The main purpose of opinion mining is to determine the writer’s attitude towards the topic under discussion. This is done by identifying the polarity of a given passage of text using different feature sets. Feature engineering in the pre-processing phase plays a vital role in improving the performance of a classifier. In this paper we empirically evaluate various feature-weighting mechanisms against well-established classification techniques for opinion mining, i.e. Naive Bayes-Multinomial for binary polarity cases and SVM-LIN for multiclass cases. To evaluate these classification techniques we train the classifiers on the publicly available Rotten Tomatoes movie reviews dataset, which is widely used by the research community for the same purpose. The experiments show that the feature set containing noun, verb, adverb, and adjective lemmas with the feature-frequency (FF) function performs best among all feature settings, with 84% and 85% of test instances correctly classified for Naive Bayes and SVM, respectively.

Highlights

  • Sentiment analysis or opinion mining is a process of recognizing and categorizing people’s sentiments, opinions, attitudes, and emotions from the text written in natural language

  • Because of the proliferation of text data on the web and social media, opinion mining has gained a lot of attention and has become an active research area in Natural Language Processing (NLP), which exploits systems and techniques from data mining

  • Natural Language Processing (NLP) is a framework that supports interaction between computers and human languages by providing the capability to process text written in natural language, using methods and techniques from fields such as computer science, computational linguistics, and artificial intelligence

Summary

Introduction

Sentiment analysis, or opinion mining, is the process of recognizing and categorizing people’s sentiments, opinions, attitudes, and emotions in text written in natural language. Millions of messages are posted on platforms such as Twitter, Rotten Tomatoes, and Facebook. These messages cover numerous topics, including public opinion about products, current affairs, politics, and movies. Research in natural language processing is focused on soft, probabilistic predictions based on assigning weights to all features. Such models have the advantage of expressing the relative certainty of many different possible answers rather than only one, and they provide more reliable and accurate results when included as a component of a larger system. Examples of such algorithms are Naive Bayes, Maximum Entropy Measure, and SVM [6].
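
To make the idea of soft, probabilistic predictions over weighted features concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier that uses feature-frequency weighting, taken here to mean the raw count of each feature in a document (an assumption, since this summary does not define the FF function). The class name `TinyMultinomialNB`, the helper `feature_frequency`, the add-one smoothing, and the four toy reviews are all hypothetical and chosen only for illustration. The tokens are assumed to be already lemmatized, and nothing here comes from the Rotten Tomatoes dataset or the paper's actual pipeline.

```python
import math
from collections import Counter


def feature_frequency(tokens):
    """Assumed FF weighting: a feature's weight is its raw count in the document."""
    return Counter(tokens)


class TinyMultinomialNB:
    """Illustrative multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        # Accumulate feature-frequency weights per class.
        self.feature_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.feature_counts[label].update(feature_frequency(tokens))
        self.vocab = set()
        for counts in self.feature_counts.values():
            self.vocab.update(counts)
        return self

    def predict_proba(self, tokens):
        """Return a probability for every class instead of a single hard label."""
        weights = feature_frequency(tokens)
        n_docs = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            total = sum(self.feature_counts[c].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[c] / n_docs)  # log prior
            for feat, weight in weights.items():
                if feat not in self.vocab:
                    continue  # ignore features never seen in training
                likelihood = (self.feature_counts[c][feat] + self.alpha) / denom
                score += weight * math.log(likelihood)
            log_scores[c] = score
        # Normalize in log space for numerical stability.
        peak = max(log_scores.values())
        exps = {c: math.exp(s - peak) for c, s in log_scores.items()}
        z = sum(exps.values())
        return {c: v / z for c, v in exps.items()}


# Toy, hand-made training data (not from any real dataset).
train_docs = [
    ["great", "moving", "film"],
    ["dull", "boring", "plot"],
    ["great", "acting"],
    ["boring", "film"],
]
train_labels = ["pos", "neg", "pos", "neg"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
probs = model.predict_proba(["great", "great", "film"])
print(probs)
```

Because the output is a distribution over classes rather than a single label, a larger system can use the relative certainty directly, for example by deferring low-confidence reviews to another component.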

