Sentiment Classification for Film Reviews in Gujarati Text Using Machine Learning and Sentiment Lexicons

Parita Shah, Priya Swaminarayan, Maitri Patel

doi:10.5614/itbj.ict.res.appl.2023.17.1.1

Open Access

https://doi.org/10.5614/itbj.ict.res.appl.2023.17.1.1

Abstract

This paper proposes two techniques for sentiment classification of Gujarati film reviews: Gujarati Lexicon Sentiment Analysis (GLSA) and Gujarati Machine Learning Sentiment Analysis (GMLSA). Five different datasets were produced to validate the accuracy of the lexicon-based and machine learning-based methods. The lexicon-based approach employs a sentiment lexicon known as GujSentiWordNet, which identifies sentiments by means of a sentiment score, for feature generation. The machine learning-based approach uses five classifiers: logistic regression (LR), random forest (RF), k-nearest neighbors (KNN), support vector machine (SVM), and naive Bayes (NB), with TF-IDF and count vectorizer for feature selection. Experiments were carried out and the results were compared using accuracy, precision, recall, and F-score as performance evaluation criteria. According to the test results, the machine learning-based technique improved accuracy by 3 to 10% on average compared to the lexicon-based approach.
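
To make the lexicon-based idea and the evaluation criteria concrete, the following is a minimal sketch rather than the authors' GLSA implementation. It assumes a tiny hand-made lexicon (the entries and scores in `TOY_LEXICON` are hypothetical and are not taken from GujSentiWordNet), whitespace tokenization, and a rule that a summed score of zero or more counts as positive. The function names `lexicon_score`, `classify`, and `evaluate` are illustrative only.

```python
# Hypothetical toy lexicon: words and scores are invented for illustration,
# NOT taken from GujSentiWordNet.
TOY_LEXICON = {
    "સરસ": 0.6,        # "nice"
    "ઉત્તમ": 0.8,       # "excellent"
    "ખરાબ": -0.7,      # "bad"
    "કંટાળાજનક": -0.5,  # "boring"
}


def lexicon_score(review, lexicon=TOY_LEXICON):
    """Sum the polarity scores of whitespace-separated tokens found in the lexicon."""
    return sum(lexicon.get(token, 0.0) for token in review.split())


def classify(review, lexicon=TOY_LEXICON):
    """Label a review by the sign of its summed score (assumption: ties count as positive)."""
    return "positive" if lexicon_score(review, lexicon) >= 0 else "negative"


def evaluate(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Invented example reviews with invented gold labels.
    reviews = [
        "ફિલ્મ ખૂબ સરસ છે",            # "the film is very nice"
        "ફિલ્મ ખરાબ અને કંટાળાજનક છે",  # "the film is bad and boring"
        "ઉત્તમ ફિલ્મ",                  # "excellent film"
        "ફિલ્મ સરસ નથી",               # "the film is not nice" (negation is ignored here)
    ]
    gold = ["positive", "negative", "positive", "negative"]
    predicted = [classify(r) for r in reviews]
    print(predicted)
    print(evaluate(gold, predicted))
```

The last review shows a known weakness of plain score summation: the negation word is not in the lexicon, so the review is scored as positive. A learned classifier over TF-IDF or count features can pick up such patterns from labeled data, which is one plausible reason a machine learning-based approach may outperform a purely lexicon-based one.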

Full Text