Abstract

Sentiment analysis has become an important method for extracting meaningful insights from textual data. Research on languages such as English and Chinese has progressed notably, but tools for sentiment analysis in Bangla have received far less attention. Datasets for Bangla sentiment analysis remain limited, especially balanced datasets that capture both binary and multiclass sentiment for e-commerce applications. This paper introduces a new sentiment analysis dataset drawn from the popular Bangladeshi e-commerce site “Daraz”. The dataset contains 1000 reviews across 5 product categories, with both binary (positive/negative) and multiclass (very positive, positive, negative, very negative) sentiment labels manually annotated by native Bangla speakers. Reviews were collected using an organized process, and labels were assigned according to standardized criteria to ensure accuracy. A benchmark evaluation of Machine Learning and Deep Learning algorithms on this dataset is also provided. The new dataset can support research on binary and multiclass Bangla sentiment analysis using machine learning, deep learning, and Large Language Models. It can also help e-commerce platforms analyse nuanced user opinions and emotions in online reviews. Because the reviews are organized by product category, the dataset also supports research on text categorization.
