Abstract

Sentiment analysis is the task of automatically extracting the sentiment expressed in written text. Many approaches to sentiment analysis rely on machine learning techniques, specifically classifiers trained on labeled datasets. In this context, Natural Language Processing (NLP) tasks are commonly applied as a preprocessing step to improve the quality of the data and convert it into a form suitable for the subsequent classification. Several studies on sentiment analysis have already evaluated NLP tasks, classifiers, or both. However, the vast majority of them did not work with texts in Brazilian Portuguese, and their analyses did not consider the combination of sets of preprocessing tasks with classifiers. Therefore, in this work, we evaluate combinations of five NLP tasks and three classifiers for sentiment analysis of texts written in Portuguese. The experimental results show that different combinations of preprocessing tasks can significantly affect the predictive performance of a classifier on a given dataset. These results highlight the importance of evaluating preprocessing tasks jointly with classifiers when choosing which of them to use for a dataset.
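
To give a sense of the size of such a joint evaluation, the following is a minimal sketch of how the experimental grid could be enumerated. It assumes that every subset of the five preprocessing tasks is paired with each of the three classifiers, which the abstract does not confirm. The task and classifier names are hypothetical placeholders, since the abstract does not list them.

```python
from itertools import combinations, product

# Hypothetical placeholder names: the abstract does not name the
# actual preprocessing tasks or classifiers.
TASKS = ["task_a", "task_b", "task_c", "task_d", "task_e"]
CLASSIFIERS = ["clf_1", "clf_2", "clf_3"]


def task_subsets(tasks):
    """Yield every subset of tasks, from the empty set to the full set."""
    for size in range(len(tasks) + 1):
        for subset in combinations(tasks, size):
            yield subset


def experiment_grid(tasks, classifiers):
    """Pair every preprocessing subset with every classifier."""
    return list(product(task_subsets(tasks), classifiers))


grid = experiment_grid(TASKS, CLASSIFIERS)
print(len(grid))  # 2**5 subsets * 3 classifiers = 96
```

Under this assumption, each dataset would require 96 train-and-evaluate runs, one per (task subset, classifier) pair, including the baseline with no preprocessing.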
