The OL-DAWE Model: Tweet Polarity Sentiment Analysis With Data Augmentation

Wenhuan Wang, Anman Zhang, Ding Feng, Bohan Li, Shuo Wan

doi:10.1109/access.2020.2976196

Open Access

Abstract

Introducing negative items into sentences can shift the polarity of emotional words and lead to misclassification. Therefore, dealing with negative items is indispensable to analyzing the polarity of tweets. This paper first combines Conjunction Analysis (CA) and Punctuation Mark Identification (PMI) to detect negation cues and their scope. We then propose the OL-DAWE model, which uses Data Augmentation (DA) to generate an opposed tweet from each original tweet. The model extends both the training and test data sets and learns the original and opposed sides of each tweet in the training module. When predicting the polarity of a tweet, the OL-DAWE model considers the positive (negative) degree of the original tweet together with the negative (positive) degree of its opposed tweet. We conduct experiments on two real-world data sets. The results demonstrate the effectiveness of the combined technique in negation processing and show that the OL-DAWE model outperforms the baseline in tweet polarity sentiment analysis while remaining simple and efficient.

Highlights

The emergence of various social media and commercial websites has encouraged people to express their opinions on multiple platforms
Since long-distance negation is often associated with conjunctions, we propose combining Punctuation Mark Identification (PMI) and Conjunction Analysis (CA) to reduce the impact of negation on polarity shift
OL-DAWE Model: We propose a comparative learning model based on Data Augmentation and Word Embedding, which incorporates the combined PMI and CA technique and six rules that we define
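
The idea of bounding a negation scope with punctuation marks and conjunctions can be illustrated with a minimal sketch. It is not the authors' implementation: the cue list, the conjunction list, the tokenizer, and the rule that a scope simply ends at the next punctuation mark or conjunction are all assumptions made for illustration, and the six rules mentioned above are not reproduced here.

```python
import re

# Hypothetical word lists, chosen only for illustration.
NEGATION_CUES = {"not", "no", "never", "cannot"}
CONJUNCTIONS = {"but", "however", "although", "though", "yet"}
PUNCTUATION = {".", ",", ";", "!", "?", ":"}


def tokenize(text):
    # Lowercase the text and split it into word tokens and punctuation marks.
    return re.findall(r"[a-z']+|[.,;!?:]", text.lower())


def negation_scope(tokens):
    """Return the indices of tokens that fall inside a negation scope.

    Assumed rule: a scope opens right after a negation cue and closes
    at the next punctuation mark or conjunction.
    """
    in_scope = []
    active = False
    for i, tok in enumerate(tokens):
        if tok in PUNCTUATION or tok in CONJUNCTIONS:
            active = False          # delimiter closes the current scope
        elif tok in NEGATION_CUES or tok.endswith("n't"):
            active = True           # cue opens a new scope
        elif active:
            in_scope.append(i)      # ordinary token inside the scope
    return in_scope


tokens = tokenize("I do not like this phone, but the camera is great")
scope_words = [tokens[i] for i in negation_scope(tokens)]
```

In this example the scope covers "like this phone" and stops at the comma, so "great" in the second clause keeps its original polarity.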

Summary

INTRODUCTION

The emergence of various social media and commercial websites has encouraged people to express their opinions on multiple platforms. Although much effort has gone into improving classification accuracy [3], [4], most methods have little effect because of an inherent difficulty of Word Embedding known as polarity shift. Most sentiment polarity analysis methods for tweets are deficient in two respects: 1) they ignore the importance of negation cues and their scope; 2) they ignore the emotional comparability between positive and negative tweets. The OL-DAWE model uses Word Embedding to learn the two opposing sides of a tweet, which are obtained through DA, and uses the polarity comparison between the original and opposed tweets to improve prediction accuracy and robustness. To the best of our knowledge, this paper is the first to use DA to apply opposed training and prediction to tweet polarity sentiment analysis. Because tweets follow irregular patterns and are closely tied to human descriptions, applying DA to tweets is much harder than applying it to canonical text.
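
The comparison between an original tweet and its opposed tweet can be sketched as a weighted combination of two classifier outputs. This is only an illustration under stated assumptions: the function name `dual_predict`, the linear weighting, the parameter `alpha`, and the 0.5 decision threshold are hypothetical and are not taken from the paper.

```python
def dual_predict(p_orig_pos, p_opp_neg, alpha=0.5):
    """Combine two views of one tweet into a polarity decision.

    p_orig_pos: classifier probability that the original tweet is positive.
    p_opp_neg:  classifier probability that the opposed tweet is negative.
    alpha:      assumed weight given to the opposed tweet (0 ignores it).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # A positive original tweet should have a negative opposed tweet,
    # so both probabilities are evidence for the positive class.
    pos_score = (1.0 - alpha) * p_orig_pos + alpha * p_opp_neg
    label = "positive" if pos_score >= 0.5 else "negative"
    return label, pos_score


# The original tweet alone looks slightly negative (0.4), but its opposed
# tweet is confidently negative (0.9), which pulls the decision to positive.
label, score = dual_predict(0.4, 0.9)
```

With `alpha=0.0` the sketch reduces to an ordinary single-view classifier, which makes the contribution of the opposed tweet easy to isolate.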

ORGANIZATION The remainder of this paper is organized as follows

EXPERIMENTS

Findings

CONCLUSION AND FUTURE WORK