Word clustering based on POS feature for efficient twitter sentiment analysis

Yili Wang, Byungjun Lee, Kyungtae Kim, Hee Yong Youn

doi:10.1186/s13673-018-0140-y

Open Access

https://doi.org/10.1186/s13673-018-0140-y

Abstract

With the rapid growth of social networking services on the Internet, a huge amount of information is continuously generated in real time. As a result, sentiment analysis of online reviews and messages has become a popular research issue [1]. In this paper, a novel modified Chi Square-based feature clustering and weighting scheme is proposed for the sentiment analysis of Twitter messages. Along with part-of-speech tagging, the discriminability and dependency of the words in the tagged training dataset are taken into account in the clustering and weighting process. The multinomial Naïve Bayes model is also employed to handle redundant features, and the influence of emotional words is raised to maximize accuracy. Computer simulation with the Sentiment 140 workload shows that the proposed scheme significantly outperforms four existing representative sentiment analysis schemes in terms of accuracy, regardless of the size of the training and test data.
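To make the classifier named in the abstract concrete, the following is a minimal sketch of a multinomial Naïve Bayes model with add-alpha smoothing. It is not the authors' implementation: the function names `train_mnb` and `predict`, the toy tweets, and the optional `weights` argument (which scales the log-likelihood contribution of chosen words as one simple way to raise the influence of emotional words) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha smoothing."""
    vocab = {w for d in docs for w in d}
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    # Class priors estimated from the share of training documents per class.
    log_prior = {c: math.log(n / len(labels)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab}
    return log_prior, log_like


def predict(model, doc, weights=None):
    """Return the highest-scoring class.

    `weights` is a hypothetical per-word multiplier on the log-likelihood;
    words without an entry use a weight of 1.0.
    """
    log_prior, log_like = model
    weights = weights or {}
    best, best_score = None, -math.inf
    for c in log_prior:
        score = log_prior[c]
        for w in doc:
            if w in log_like[c]:  # words unseen in training are skipped
                score += weights.get(w, 1.0) * log_like[c][w]
        if score > best_score:
            best, best_score = c, score
    return best


# Toy, hand-made data (not from Sentiment 140).
docs = [["love", "this", "phone"], ["great", "camera"],
        ["hate", "this", "phone"], ["awful", "battery"]]
labels = ["pos", "pos", "neg", "neg"]
model = train_mnb(docs, labels)
```

With this toy data, `predict(model, ["love", "camera"])` selects the positive class. The tweet `["love", "awful"]` carries equal evidence for both classes when unweighted, and passing `weights={"awful": 2.0}` tips it to the negative class, which illustrates how amplifying an emotional word can change the decision.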

Highlights

Massive volumes of data are generated and shared through the internet [2,3,4]
In this paper, a novel feature weighting approach is proposed, inspired by the expectation that strengthening words with strong discriminability may allow higher accuracy in sentiment analysis [22, 23]
Twitter sentiment analysis has become a promising technique for industry and academia

Summary

Introduction

Massive volumes of data are generated and shared through the internet [2,3,4]. Data originating from the internet takes various forms, and text is especially popular for expressing and sharing information between individual users. In this paper, a novel feature weighting approach is proposed, inspired by the expectation that strengthening words with strong discriminability may allow higher accuracy in sentiment analysis [22, 23]. A novel feature reduction method is proposed to reduce the dimensionality (the number of features) [26]; it omits irrelevant data while classifying the training dataset into a small number of features, and achieves reasonable computational complexity when weighting the words [27, 28]. A novel composite feature weighting technique is also proposed, which considers both the dependency derived using the modified Chi Square technique and the discriminability of the clustered feature set.
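The dependency between a word and a sentiment class can be illustrated with the standard Pearson Chi Square statistic on a 2×2 contingency table. The sketch below uses that textbook statistic, not the modified version proposed in the paper, whose exact form is not given in this summary. The helper names `chi_square` and `word_class_scores` and the toy tweets are assumptions made for illustration.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Pearson Chi Square for a 2x2 word/class contingency table.

    a: tweets in the class that contain the word
    b: tweets outside the class that contain the word
    c: tweets in the class that lack the word
    d: tweets outside the class that lack the word
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty margin leaves the statistic undefined; treat as no evidence
    return n * (a * d - b * c) ** 2 / denom


def word_class_scores(tweets, labels, target):
    """Score every word by its dependency on the `target` label."""
    n_target = sum(1 for y in labels if y == target)
    n_other = len(labels) - n_target
    in_target, in_other = {}, {}
    for text, y in zip(tweets, labels):
        # Count each word once per tweet (document frequency).
        for w in set(text.lower().split()):
            counts = in_target if y == target else in_other
            counts[w] = counts.get(w, 0) + 1
    scores = {}
    for w in set(in_target) | set(in_other):
        a = in_target.get(w, 0)
        b = in_other.get(w, 0)
        scores[w] = chi_square(a, b, n_target - a, n_other - b)
    return scores


# Toy, hand-made data (not from Sentiment 140).
tweets = ["love this phone", "love the camera",
          "hate this phone", "hate the battery"]
labels = ["pos", "pos", "neg", "neg"]
scores = word_class_scores(tweets, labels, "pos")
```

In this toy set, "love" appears only in positive tweets and receives the highest score, while "phone" appears equally in both classes and scores zero. A weighting scheme in the spirit described above would give the first kind of word more influence than the second.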

Related work

E11 E12 E21 E22

Findings

Conclusion

Journal: Human-centric Computing and Information Sciences
Publication Date: Jun 18, 2018
Citations: 25
License type: open-access

Similar Papers

A Novel Feature-Based Text Classification Improving the Accuracy of Twitter Sentiment Analysis
Yili Wang ... Yuhui Zheng
20 Dec 2017

Sentiment Analysis of Chinese Online Reviews Based on Word2vec and DBN
Sai-Hong Zeng ... Chao-Fan Dai
DEStech Transactions on Computer Science and Engineering
21 Jun 2017

Sentiment Analysis of Reviews Using Bi-LSTM Using a Fine-Grained Approach
Rishika Garg ... Praveen Singh
10 Nov 2022

An Unsupervised Aspect Detection Model for Sentiment Analysis of Reviews
Ayoub Bagheri ... Mohamad Saraee
01 Jan 2013
