Abstract

Due to the continuous and rapid growth of social media, users actively create opinionated content in different languages about various products, services, events, and political parties. The need to classify this content automatically has prompted research on multilingual sentiment analysis. However, most research efforts are devoted to language pairs such as English and Arabic, English and German, or English and French, while a great share of information is available in other languages such as Hausa. This paper proposes multilingual sentiment analysis of English and Hausa tweets using an Enhanced Feature Acquisition Method (EFAM). The method uses a machine learning approach to integrate two newly defined Hausa features (Hausa Lexical Feature and Hausa Sentiment Intensifiers) with an English feature in order to measure classification performance and to synthesize a more accurate sentiment classification procedure. The approach has been evaluated in several experiments with different classifiers on both monolingual and multilingual datasets. The experimental results reveal the effectiveness of the approach in enhancing feature integration for multilingual sentiment analysis. In addition, by using features drawn from multiple languages, we can construct machine learning classifiers with an average precision of over 65%.

Highlights

  • Social media have turned the web into a vast source of information that is generated by users about all kinds of topics

  • This paper proposes multilingual sentiment analysis of English and Hausa tweets using an Enhanced Feature Acquisition Method (EFAM)

  • The method integrates features originating from two languages (English and Hausa) into a machine learning approach to multilingual sentiment analysis
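
As a rough illustration of what integrating features from two languages could look like, the sketch below builds one feature vector per tweet from an English lexicon score, a Hausa lexicon score, and a count of Hausa intensifiers. This is not the paper's EFAM implementation: the function name `extract_features`, the three-element vector layout, and the tiny word lists are all assumptions made for illustration, and the paper's actual lexicons and feature definitions are not given in this excerpt.

```python
from typing import Dict, List, Set

# Hypothetical toy resources (assumptions for illustration only).
# The paper's real English and Hausa lexicons are not shown in this excerpt.
ENGLISH_LEXICON: Dict[str, int] = {"good": 1, "great": 1, "bad": -1, "poor": -1}
HAUSA_LEXICON: Dict[str, int] = {"kyau": 1, "mugu": -1}
HAUSA_INTENSIFIERS: Set[str] = {"sosai", "kwarai"}


def extract_features(tweet: str) -> List[float]:
    """Return [english_score, hausa_score, hausa_intensifier_count] for a tweet."""
    # Naive whitespace tokenization; a real system would normalize tweets further.
    tokens = tweet.lower().split()
    # Sum of polarity values for tokens found in each language's lexicon.
    english_score = sum(ENGLISH_LEXICON.get(token, 0) for token in tokens)
    hausa_score = sum(HAUSA_LEXICON.get(token, 0) for token in tokens)
    # Number of tokens that are listed as Hausa intensifiers.
    intensifier_count = sum(1 for token in tokens if token in HAUSA_INTENSIFIERS)
    return [float(english_score), float(hausa_score), float(intensifier_count)]


if __name__ == "__main__":
    print(extract_features("good phone kyau sosai"))
```

Vectors of this kind could then be passed to any standard classifier; keeping the per-language scores in separate positions lets the classifier weight English and Hausa evidence independently.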

Summary

Introduction

Social media have turned the web into a vast source of information that is generated by users about all kinds of topics. Because of the large volume of information, automated approaches that allow users to interact effectively with opinionated content [3] on the internet have been developed [4]. Such approaches form the field of sentiment analysis. Twitter users express their opinions in different languages such as Arabic, Spanish, German, French, and Hausa. This has prompted the need for sentiment analysis systems that can discover sentiment in a Twitter document written in English and one other language.

