Abstract

Understanding the sentiment conveyed by a person is a crucial task in any social interaction, and it can also be used to gain insight into views held by many people. Sentiment classification is not limited to human interaction, as text can also convey the sentiment of its author. Opinion mining in text is a long-studied field in machine learning. This study focuses on two of the many text domains used in sentiment analysis: reviews and tweets. We aim to determine the effect of performing cross-domain sentiment classification using either reviews or tweets as training data. We conduct an empirical investigation using two tweet datasets, one review dataset, and three classifiers. We run 18 experiments, varying the training dataset, the classifier used to build the model, and the dataset used to evaluate the resulting model. Our results show that training on either tweet dataset yields an effective classifier for reviews. However, the converse, using reviews to classify sentiment in tweets, has the worst performance of all models, producing AUC values ranging from 0.59 to 0.65. Our best model is generated by training a Multinomial Naive Bayes classifier on tweets and evaluating it on reviews. Multinomial Naive Bayes was the best-performing learner, producing the highest AUC in 5 of the 6 combinations of training and test datasets. To the best of our knowledge, this study is the first to examine the effects of cross-domain sentiment classification using tweets and reviews.
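
The experiment counts in the abstract can be checked with a short sketch. It assumes one reading of the design that is consistent with the stated numbers: each of the three datasets is used for training and a different dataset for evaluation, which gives 6 ordered training/test pairs, and each pair is run with all three classifiers, which gives 18 experiments. The dataset names and the two classifier names other than Multinomial Naive Bayes are placeholders, since the abstract does not give them.

```python
from itertools import permutations, product

# Placeholder names (assumptions): the abstract does not name the datasets
# and names only one of the three classifiers, Multinomial Naive Bayes.
DATASETS = ["tweets_a", "tweets_b", "reviews"]
CLASSIFIERS = ["multinomial_naive_bayes", "classifier_2", "classifier_3"]


def experiment_grid(datasets, classifiers):
    """Return (train, test, classifier) triples where train != test."""
    # Ordered pairs of distinct datasets: training on A and testing on B
    # is a different experiment from training on B and testing on A.
    pairs = list(permutations(datasets, 2))
    return [(train, test, clf) for (train, test), clf in product(pairs, classifiers)]


grid = experiment_grid(DATASETS, CLASSIFIERS)
train_test_pairs = sorted({(train, test) for train, test, _ in grid})

if __name__ == "__main__":
    print(len(train_test_pairs), "train/test pairs,", len(grid), "experiments")
    for train, test, clf in grid:
        print(f"train={train:9s} test={test:9s} classifier={clf}")
```

Under this reading, the "5 out of the 6 combinations" figure refers to the ordered training/test pairs, with the three classifiers compared by AUC within each pair.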

