Abstract

Sentiment analysis research in the Malaysian context is currently limited by the lack of an available sentiment lexicon. This paper addresses that gap in order to improve the accuracy of sentiment analysis. Following a detailed review of existing approaches, a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) is constructed. Constructing MELex involves three activities: seed word selection, polarity assignment, and synonym expansion. Our approach differs from previous work in that MELex can analyze text in the two most widely used languages in Malaysia, Malay and English, with an achieved accuracy of 90%. It is evaluated through experimentation and a case study in which affordable housing projects in Malaysia serve as the case projects. The findings indicate that MELex is able to analyze public sentiment in the Malaysian context. The novel aspects of this paper are twofold: first, it introduces a new technique for assigning polarity scores, and second, it improves classification performance on mixed-language content.

Highlights

  • This paper presents a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) for mining public opinion, covering the two main languages in Malaysia: Malay and English

  • A new approach to constructing a bilingual sentiment lexicon, combining term frequency and word vector representation, is proposed

  • We demonstrate the effectiveness of the classification through experiments using MELex, the newly constructed bilingual and domain-specific sentiment lexicon
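
As a rough illustration of how a bilingual lexicon can be applied to mixed-language text, the following minimal Python sketch expands a few seed words with synonyms and classifies a sentence by summing word polarities. It is not the MELex construction procedure: the seed words, the synonym table, and the function names are all hypothetical, and polarity is assigned by hand here, whereas the paper assigns it using term frequency and word vector representation.

```python
import re

# Hypothetical seed words with hand-assigned polarity (+1 positive, -1 negative).
# These entries are illustrative only and are NOT taken from MELex.
SEED_POLARITY = {
    "good": 1, "bagus": 1,          # English / Malay
    "expensive": -1, "mahal": -1,   # English / Malay
}

# Hypothetical synonym table used to expand the seed words.
SYNONYMS = {
    "good": ["great"],
    "bagus": ["baik"],
    "expensive": ["costly"],
}

def expand_lexicon(seeds, synonyms):
    """Copy each seed word's polarity onto its listed synonyms."""
    lexicon = dict(seeds)
    for word, polarity in seeds.items():
        for synonym in synonyms.get(word, []):
            # Keep an existing entry if the synonym is already in the lexicon.
            lexicon.setdefault(synonym, polarity)
    return lexicon

def classify(text, lexicon):
    """Sum the polarity of known words; unknown words contribute 0."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

LEXICON = expand_lexicon(SEED_POLARITY, SYNONYMS)

if __name__ == "__main__":
    # A sentence that mixes Malay and English in the way social media posts often do.
    print(classify("Rumah ini bagus and the location is great", LEXICON))
```

Because Malay and English entries sit in one dictionary, a sentence that switches language mid-way is scored in a single pass, with no language detection step.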


Summary

Introduction

Social network sites such as Facebook, Twitter, and blogs have changed the way people communicate and connect. The new challenge is how to process and interpret the massive amount of information available on social media. This challenge is the object of research in the discipline called “sentiment analysis.” Sentiment analysis has been one of the most active research areas in natural language processing (NLP) since the early 2000s [1]. With globalization, it is common to see posts written in multiple languages, which makes sentiment analysis even more complex and challenging. In unstructured content such as Twitter posts, people tend to mix languages within a single sentence. According to Lo, Cambria, Chiong, and Cornforth [5], specific information in other languages might be missed if the analysis is done for a single language only.

CMC, 2022, vol., no.1

