Abstract

In recent years, cross-language sentiment classification has attracted considerable interest as research in sentiment analysis has shifted its focus from English to lower-resource languages. Cross-language sentiment classification uses automated machine translation (MT) to exploit the infrastructure of languages rich in linguistic resources, mainly English, in order to help build sentiment analysis systems for low-resource languages. In this study, we explore how MT influences cross-language sentiment classification. To this end, we perform three experiments, obtaining promising results. In the first experiment, we automatically translate 4,000 positive and negative reviews from English into Greek and Italian, thus obtaining labeled sentiment datasets in these languages. We then train a Naive Bayes classifier on the translated data and compare its performance with that obtained on the source dataset. In the second experiment, the translated reviews are automatically translated back into the source language (English), so that the classification accuracy can be compared with that obtained on the original dataset. In the third experiment, the reviews are translated from the source language (English) into Italian through an intermediate translation into Greek, to examine whether performance is further diminished compared with the direct translation of the first experiment.
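The three experimental settings can be illustrated with a minimal sketch. It is not the implementation used in the study: the multinomial Naive Bayes with add-one smoothing, the whitespace tokenizer, the toy reviews, and the helper names (`make_toy_mt`, `compose`, `evaluate_setting`) are all assumptions made for illustration, and the "translators" are invented word-substitution tables standing in for a real MT system.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the study's preprocessing is not specified.
    return text.lower().split()

class NaiveBayes:
    """Multinomial Naive Bayes with additive smoothing (illustrative only)."""
    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def make_toy_mt(mapping):
    # Hypothetical stand-in for an MT system: token-by-token substitution.
    def translate(text):
        return " ".join(mapping.get(tok, tok) for tok in tokenize(text))
    return translate

def compose(*steps):
    # Chain translators, e.g. English -> Greek -> English.
    def translate(text):
        for step in steps:
            text = step(text)
        return text
    return translate

def evaluate_setting(translate, train, test):
    # Translate both splits, train on the translated reviews, report accuracy.
    model = NaiveBayes().fit([translate(t) for t, _ in train], [y for _, y in train])
    correct = sum(model.predict(translate(t)) == y for t, y in test)
    return correct / len(test)

# Invented toy reviews; the study uses 4,000 English reviews.
TOY_TRAIN = [("great movie loved it", "pos"), ("wonderful acting great plot", "pos"),
             ("terrible movie hated it", "neg"), ("boring plot awful acting", "neg")]
TOY_TEST = [("loved the great acting", "pos"), ("awful boring movie", "neg")]

# Invented lossy mappings: two English words collapse into one target token.
en_to_el = make_toy_mt({"great": "x_good", "wonderful": "x_good"})
el_to_en = make_toy_mt({"x_good": "great"})
el_to_it = make_toy_mt({"x_good": "y_good"})

SETTINGS = {
    "source (English)": lambda text: text,
    "experiment 1 (direct)": en_to_el,
    "experiment 2 (round trip)": compose(en_to_el, el_to_en),
    "experiment 3 (pivot via Greek)": compose(en_to_el, el_to_it),
}

if __name__ == "__main__":
    for name, translate in SETTINGS.items():
        print(name, evaluate_setting(translate, TOY_TRAIN, TOY_TEST))
```

Each setting differs only in the translation function applied before training and testing, so any accuracy gap between settings can be attributed to the translation path rather than to the classifier.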
