Abstract

This work implements Fine-Grained Sentiment Analysis (FGSA) of Malayalam tweets. The tweets are classified into five sentiment classes: strongly positive, positive, neutral, negative, and strongly negative. Both lexicon-based and machine learning-based approaches are used for sentiment classification. Lexicon-based approaches can be dictionary-based or corpus-based; the dictionary-based approach is used in this work. For the machine learning approach, Support Vector Machine (SVM) and Random Forest (RF) classifiers are used. Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), and SentiWordNet feature matrices are used to vectorize the input dataset. The lexicon-based approach achieved an accuracy of 84.8%. Among the machine learning models, SVM with a linear kernel, SVM with an RBF kernel, and RF, each using the SentiWordNet feature vector, achieved accuracies of 92.6%, 92.9%, and 93.4%, respectively.
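
As a rough illustration of dictionary-based five-class scoring, the following minimal sketch sums per-word polarity scores and maps the total to a sentiment class. The lexicon entries (English stand-ins for Malayalam words), the `classify` function, and the `strong` threshold are all hypothetical and are not taken from the paper.

```python
# Hypothetical toy lexicon: word -> polarity score (assumed, not from the paper).
# English stand-ins are used here in place of Malayalam lexicon entries.
LEXICON = {"good": 1, "excellent": 2, "bad": -1, "terrible": -2}


def classify(tweet, lexicon=LEXICON, strong=2):
    """Sum the polarity of known words and map the total to one of five classes.

    The `strong` threshold separating "strongly" classes is an assumption.
    """
    # Whitespace tokenization is a simplification; unknown words score 0.
    score = sum(lexicon.get(token, 0) for token in tweet.lower().split())
    if score >= strong:
        return "strongly positive"
    if score > 0:
        return "positive"
    if score <= -strong:
        return "strongly negative"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `classify("good movie")` yields `"positive"`, while a tweet with no lexicon words falls back to `"neutral"`. A real system would need Malayalam-aware tokenization and a much larger lexicon.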
