SiTAKA at SemEval-2017 Task 4: Sentiment Analysis in Twitter Based on a Rich Set of Features

Mohammed Jabreel, Antonio Moreno

doi:10.18653/v1/s17-2115

Abstract

This paper describes SiTAKA, the system we submitted to SemEval-2017 Task 4A, Sentiment Analysis in Twitter, for both English and Arabic. The system represents tweets with a novel set of features, which include a bag of negated words and information drawn from several lexicons. The polarity of each tweet is determined by a Support Vector Machine classifier. Our system ranked 2nd among 8 systems on the Arabic tweets and 8th among 38 systems on the English tweets.

Highlights

Sentiment analysis in Twitter is the problem of identifying people’s opinions expressed in tweets
This paper proposes the representation of tweets using a novel set of features, which include the information provided by seven lexicons and a bag of negated words (BonW)
The evaluation metrics used by the task organizers were the macro-averaged recall (ρ), the F1 averaged across the positive and negative classes (F1^PN), and the accuracy (Acc) (Rosenthal et al., 2017)
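
The three measures named in the last highlight can be computed from gold and predicted labels as in the sketch below. This is only an illustration and not the organizers' scoring script: the label strings, the function names `evaluate` and `safe_div`, and the convention of returning 0 for an empty denominator are assumptions made here.

```python
from collections import Counter

# Assumed label names for the three-class task (positive / neutral / negative).
LABELS = ("positive", "neutral", "negative")


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def evaluate(gold, pred):
    """Return macro-averaged recall, F1 averaged over positive/negative, and accuracy."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # the gold class was missed
            fp[p] += 1  # the predicted class was wrongly assigned

    recall = {c: safe_div(tp[c], tp[c] + fn[c]) for c in LABELS}
    precision = {c: safe_div(tp[c], tp[c] + fp[c]) for c in LABELS}
    f1 = {
        c: safe_div(2 * precision[c] * recall[c], precision[c] + recall[c])
        for c in LABELS
    }
    return {
        # Recall averaged over all three classes.
        "macro_recall": sum(recall.values()) / len(LABELS),
        # F1 averaged over the positive and negative classes only.
        "f1_pn": (f1["positive"] + f1["negative"]) / 2,
        "accuracy": safe_div(sum(tp.values()), len(gold)),
    }


if __name__ == "__main__":
    gold = ["positive", "positive", "neutral", "negative"]
    pred = ["positive", "neutral", "neutral", "negative"]
    print(evaluate(gold, pred))
```

Note that the neutral class still influences F1^PN indirectly, because tweets misclassified as neutral lower the recall of the positive and negative classes.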

Summary

Introduction

Sentiment analysis in Twitter is the problem of identifying people’s opinions expressed in tweets. The success of machine learning models on this problem depends on two main factors: a large amount of labeled data and the careful design of a set of features that can distinguish between positive, negative and neutral samples. Accordingly, most studies have focused on designing an effective set of features to obtain good classification performance (Feldman, 2013; Liu, 2012; Pang and Lee, 2008). This paper proposes the representation of tweets using a novel set of features, which include the information provided by seven lexicons and a bag of negated words (BonW). The conclusions and directions for further work are presented in the last section.
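
To make the idea of a bag of negated words concrete, the sketch below counts the words that fall inside a negated context. The text above does not say how the negated context is delimited, so this example assumes a common heuristic: the context starts after a negation word and ends at the next punctuation mark. The negator list, the tokenizer and the function names are all hypothetical choices for illustration, not the authors' implementation.

```python
import re
from collections import Counter

# Assumed (incomplete) list of English negation words.
NEGATORS = {"not", "no", "never", "cannot", "nobody", "nothing"}
# Punctuation marks assumed to close a negated context.
CLAUSE_END = set(".,!?;:")


def is_negator(token):
    """Treat listed negation words and contractions ending in n't as negators."""
    return token in NEGATORS or token.endswith("n't")


def bag_of_negated_words(text):
    """Count words that appear between a negator and the next punctuation mark."""
    # Very simple tokenizer: lowercase words (ASCII apostrophes allowed) and punctuation.
    tokens = re.findall(r"[a-z']+|[.,!?;:]", text.lower())
    bag = Counter()
    in_scope = False
    for token in tokens:
        if token in CLAUSE_END:
            in_scope = False      # punctuation closes the negated context
        elif is_negator(token):
            in_scope = True       # a negator opens a negated context
        elif in_scope:
            bag[token] += 1       # word is inside the negated context
    return bag


if __name__ == "__main__":
    print(bag_of_negated_words("I don't like this phone, but the camera is great!"))
```

In this example only "like", "this" and "phone" are counted, since the comma ends the negated context before "but the camera is great". The resulting counts could be concatenated with lexicon-based features before being passed to a classifier.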

Resources

En-SiTAKA Lexicons

Ar-SiTAKA Lexicons

Embeddings

Preprocessing and Normalization

Features Extraction

Syntactic Features

Lexicon Features

Cluster Features

Embedding Features

Classifier

Results

Conclusion

Publication Date: Jan 1, 2017
Citations: 43
License type: cc-by
