Abstract

This paper reports our submissions to Task 4, Sentiment Analysis in Twitter (SAT), at SemEval 2016. The task consists of five subtasks grouped into two levels: (1) sentence level, i.e., message polarity classification (subtask A), and (2) topic level, i.e., tweet classification and quantification on a two-point scale (subtasks B and D) or a five-point scale (subtasks C and E). We participated in all five subtasks. To address them, we investigated several traditional Natural Language Processing (NLP) features, including sentiment lexicon, linguistic, and domain-specific features, as well as word embedding features, together with supervised machine learning methods. The officially released results show that our systems rank above average.


Summary

Introduction

With the emergence of social media, more and more users share and obtain information through microblogging websites such as Twitter. SemEval 2016 provides a common platform for researchers to explore this area through the task of Sentiment Analysis in Twitter (Task 4; Nakov et al., 2016), which includes five subtasks grouped into two levels, i.e., sentence level and topic level. Subtask A is a sentence-level task that aims at sentiment polarity classification of the whole tweet. Subtask B is to identify the sentiment polarity label (i.e., Positive or Negative) of tweets with respect to a given topic, while subtask D aims to estimate the sentiment distribution of tweets with respect to a given topic. Both subtasks B and D use a two-point scale. The purposes of subtasks C and E are similar to those of subtasks B and D, except that they use a five-point scale; that is, the class labels take five values, i.e., 2, 1, 0, -1 and -2, representing Very Positive, Positive, Neutral, Negative and Very Negative, respectively.
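
To make the difference between topic-level classification (subtasks B and C) and quantification (subtasks D and E) concrete, the following is a minimal sketch of the simplest quantification baseline, often called classify and count: label each tweet, then report the per-topic share of each label. It is an illustration only and is not claimed to be the method used in our systems. The function name `topic_distribution`, the topic names, and the toy predictions are hypothetical; only the five-point label set comes from the task description above.

```python
from collections import Counter, defaultdict

# Five-point scale used in subtasks C and E, as listed in the text.
FIVE_POINT = {
    2: "Very Positive",
    1: "Positive",
    0: "Neutral",
    -1: "Negative",
    -2: "Very Negative",
}


def topic_distribution(predictions, labels):
    """Estimate the per-topic label distribution by counting predicted labels.

    predictions: iterable of (topic, label) pairs, one pair per tweet.
    labels: the full label set, so labels never predicted get share 0.0.
    """
    counts = defaultdict(Counter)
    for topic, label in predictions:
        if label not in labels:
            raise ValueError(f"unknown label: {label!r}")
        counts[topic][label] += 1
    return {
        topic: {label: c[label] / sum(c.values()) for label in labels}
        for topic, c in counts.items()
    }


# Hypothetical tweet-level predictions (illustrative, not data from the paper).
toy = [
    ("topic_a", 2), ("topic_a", 1), ("topic_a", 1), ("topic_a", -1),
    ("topic_b", 0), ("topic_b", -2),
]
dist = topic_distribution(toy, list(FIVE_POINT))
for topic, shares in dist.items():
    print(topic, {FIVE_POINT[label]: share for label, share in shares.items()})
```

The same function covers the two-point setting of subtask D if `labels` is restricted to a Positive and a Negative value. Counting predicted labels inherits any bias of the underlying classifier, which is one reason quantification is evaluated as a task of its own.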

