Abstract

Sentiment analysis of social media has attracted significant attention. Although researchers have proposed many methods, no single method readily meets the requirements of industrial applications. In this paper, drawing on Tencent's massive data and on industrial practice, we present a multilayered analysis system (MAS) for social media. The system consists of three sub-systems: topic correlation calculation, topic-related sentence recognition, and sentence polarity classification. Each sub-system is composed of several simple models. We have also set up a closed-loop feature mining and model updating system, which continuously improves the performance of MAS. This offline system requires very little intervention. The complete system, including its online and offline parts, has been applied in several practical projects and obtained the best results in the evaluation of Task 2 of SIGHAN-8.

Highlights

  • The popularity of Web 2.0 applications has promoted the emergence of user-generated content (UGC), e.g., comments in the blogosphere, and UGC reflects the viewpoints of web users towards a specific event or product.

  • We focus on sentiment analysis of short texts generated by users, for example microblog posts, news comments, product comments, and tweets.

  • Researchers have proposed many methods to improve the effectiveness of sentiment analysis. Mei (2007) introduced a latent semantic analysis model, e.g., LDA, for sentiment analysis.

Summary

Introduction

The popularity of Web 2.0 applications has promoted the emergence of user-generated content (UGC), e.g., comments in the blogosphere, and UGC reflects the viewpoints of web users towards a specific event or product. Scholars have carried out a series of studies on these data, especially in sentiment analysis, which aims to understand subjective opinions about people, events, and other subjects by analyzing the content that users publish. One KNN-based approach uses emoticons and hashtags to classify sentiment in tweets. Another significant effort is that of Barbosa and Feng (2010), who use polarity predictions from three websites as noisy labels to train an SVM model. Based on Tencent's massive data, we propose a multilayered approach that integrates multiple simple methods. The results show that both precision and recall improved substantially.
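To make the layered idea concrete, the following is a minimal sketch of how three simple stages (topic correlation calculation, topic-related sentence recognition, and sentence polarity classification) might be chained. It is not the authors' implementation: the word lists, the function names, and the `min_correlation` threshold are all hypothetical, and each stage is reduced to a keyword lookup where the real system combines several models per layer.

```python
import re
from typing import List, Optional

# Hypothetical toy lexicons; the paper's actual features are not shown here.
TOPIC_TERMS = {"phone", "battery", "screen"}
POSITIVE = {"great", "love", "good"}
NEGATIVE = {"bad", "terrible", "hate"}


def tokens(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    stripped = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in stripped if w]


def topic_correlation(text: str) -> float:
    """Layer 1 (assumed form): fraction of tokens that are topic terms."""
    toks = tokens(text)
    if not toks:
        return 0.0
    return sum(t in TOPIC_TERMS for t in toks) / len(toks)


def topic_sentences(text: str) -> List[str]:
    """Layer 2 (assumed form): keep sentences that mention a topic term."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [s for s in sentences if TOPIC_TERMS & set(tokens(s))]


def sentence_polarity(sentence: str) -> int:
    """Layer 3 (assumed form): sign of positive minus negative word counts."""
    toks = tokens(sentence)
    score = sum(t in POSITIVE for t in toks) - sum(t in NEGATIVE for t in toks)
    return (score > 0) - (score < 0)


def analyze(text: str, min_correlation: float = 0.1) -> Optional[int]:
    """Run the layers in order; return None when the text is off-topic."""
    if topic_correlation(text) < min_correlation:
        return None
    total = sum(sentence_polarity(s) for s in topic_sentences(text))
    return (total > 0) - (total < 0)
```

In this sketch, `analyze("I love this phone. The weather is terrible.")` scores only the first sentence, because the second one mentions no topic term. Filtering before classifying is the point of the layering: each stage can stay simple because the stage before it has already narrowed the input.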

A Multilayered Analysis System
Topic Correlation Calculation
Topic-Related Sentence Recognition
Sentence Polarity Classification
The Closed-loop Updating System
Algorithms
Features
Experiments
Findings
Conclusion