Abstract

Nowadays, people use social media platforms such as Facebook, Twitter, and Instagram to share their opinions on particular entities or services. Sentiment analysis can determine the polarity of these opinions, especially in the political domain. In Malaysia, however, current sentiment analysis can be inaccurate when netizens tend to use combinations of Malay words in their comments, because Malay corpora and sentiment analysis tools are insufficient. This study therefore aims to construct a multistage sentiment classification model based on the Malaysia Political Ontology and the Malay Political Corpus. The literature review covers sentiment analysis, classification techniques, Malay sentiment analysis, and sentiment analysis on politics. The work starts with data preparation for Malay tweets to produce tokenized Malay words, followed by construction of the corpus using corpus filtering, web search, and filtering with linguistic patterns, before the corpus is enhanced with political lexicons. The process continues with classifier construction, which starts with a generic ontology with Malaysia's political context. Lastly, twelve features are identified, and the extracted features are tested using different classifiers. As a result, the Linear Support Vector Machine yields an accuracy of 86.4% for the classification. This shows that the multistage sentiment classification model improves the classification of Malay tweets in the political domain.
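The data preparation step, which turns raw Malay tweets into tokenized words, can be illustrated with a minimal sketch. The cleaning rules below (lowercasing, dropping URLs and @-mentions, keeping hyphenated reduplicated forms as one token) are assumptions for illustration only and are not taken from the study; the function name `tokenize_tweet` is hypothetical.

```python
import re

# Assumed cleaning rules; the study's actual preprocessing may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Keeps hyphenated forms such as "undang-undang" as a single token.
TOKEN_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")


def tokenize_tweet(text):
    """Return a list of lowercase word tokens from one tweet."""
    text = text.lower()
    text = URL_RE.sub(" ", text)      # drop links
    text = MENTION_RE.sub(" ", text)  # drop @user mentions
    text = text.replace("#", " ")     # keep the hashtag word, drop the symbol
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize_tweet("Saya suka @user dasar baru ini! https://t.co/abc #Politik"))
```

Tokens produced this way would then feed the corpus construction and feature extraction stages described above.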

Highlights

  • Social media is a common platform for internet users

  • The main idea of this study is to propose a multistage sentiment classification model using Malaysia Political Ontology and Malay Political Corpus

  • The politicians and political parties are classified into government or opposition using Malaysia Political Ontology (MPO)
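
The government/opposition lookup mentioned in the last highlight can be sketched as a simple mapping. This is only an illustrative stand-in for the Malaysia Political Ontology: the entries (`party_a`, `politician_x`, and so on) are hypothetical placeholders, and the real ontology is not reproduced here.

```python
# Hypothetical placeholder entries, not the contents of the real MPO.
PARTY_BLOC = {
    "party_a": "government",
    "party_b": "opposition",
}
POLITICIAN_PARTY = {
    "politician_x": "party_a",
    "politician_y": "party_b",
}


def bloc_of(entity):
    """Map a politician or party name to 'government', 'opposition', or 'unknown'."""
    key = entity.lower()
    # A politician resolves to their party first; a party name passes through.
    party = POLITICIAN_PARTY.get(key, key)
    return PARTY_BLOC.get(party, "unknown")


def blocs_in_tokens(tokens):
    """Return the bloc of every token that names a known entity."""
    blocs = (bloc_of(token) for token in tokens)
    return [bloc for bloc in blocs if bloc != "unknown"]
```

A classifier could use the returned bloc as one input feature, for example to tell whether a tweet targets the government or the opposition.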


Summary

Introduction

Social media is a common platform for internet users. Netizens can spread issues and make them go viral quickly via social media such as Facebook, Twitter, blogs, Instagram, and other online platforms. Existing sentiment analysis classifiers can analyze different languages such as English, French, Indian languages, Arabic, and Chinese. However, they are not yet able to analyze the Malay language accurately. Most social media monitoring tools classify any comment containing Malay words as neutral. This is one reason for research on Malay sentiment classifiers, which have used a lexicon with k-nearest neighbor [1], a lexicon [2], and other classification methods [3]. Malay sentiment analysis that covers the political domain is still lacking [4].

