Abstract

The number of messages that can be mined from online entries grows with the number of online application users. In Malaysia, online messages are often written in a mixture of languages known as ‘Bahasa Rojak’, which makes opinion mining with standard natural language processing techniques difficult. This study introduces the Malay Mixed Text Normalization Approach (MyTNA) and a feature selection technique based on the Immune Network System (FS-INS) for opinion mining with a machine learning approach. MyTNA normalizes noisy text in online messages, and FS-INS automatically selects relevant features for the opinion mining process. Several experiments were conducted on 1000 positive and 1000 negative movie feedback entries. The results show that the accuracy of opinion mining using Naïve Bayes (NB), k-Nearest Neighbor (kNN) and Sequential Minimal Optimization (SMO) increases after MyTNA and FS-INS are introduced.

Highlights

  • It was reported on 30 June 2011 that 60.7% of Malaysians, or 17.7 million people, used the Internet

  • The objective of this paper is to introduce the Malay Mixed Text Normalization Approach (MyTNA), a method for normalizing noisy text in mixed-language Malay messages

  • Both the training data and the test data were normalized before the opinion mining process
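
As a rough illustration of normalizing both training and test text before classification, the sketch below applies a small lookup-based normalizer and then a multinomial Naïve Bayes classifier. It is not MyTNA or FS-INS: the lookup table `NOISY_TO_STANDARD`, the repeated-letter rule, the toy sentences, and the helper names `normalize` and `classify` are all assumptions made for this example, and feature selection is omitted.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical lookup table; entries are illustrative, not taken from MyTNA.
NOISY_TO_STANDARD = {
    "x": "tidak",     # shorthand for "tidak" (not)
    "sgt": "sangat",  # abbreviation of "sangat" (very)
    "best": "bagus",  # English word mixed into Malay text
}


def normalize(text):
    """Lowercase, tokenize, collapse letter runs, and map noisy tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    cleaned = []
    for token in tokens:
        # Collapse runs of three or more identical letters: "bestttt" -> "best".
        token = re.sub(r"(.)\1{2,}", r"\1", token)
        cleaned.append(NOISY_TO_STANDARD.get(token, token))
    return cleaned


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch.
train = [
    ("Filem ni sgt best", "positive"),
    ("Cerita bagus", "positive"),
    ("Filem ni sgt bosan", "negative"),
    ("Cerita x bagus, bosan", "negative"),
]

# Training text is normalized before fitting ...
model = NaiveBayes().fit([normalize(t) for t, _ in train], [y for _, y in train])


def classify(text):
    # ... and test text goes through the same normalizer before prediction.
    return model.predict(normalize(text))
```

Because the same `normalize` function runs on both sides, a noisy test message such as "Bestttt sgt filem ni" is reduced to tokens the classifier has already seen in training.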


Summary

Introduction

It was reported on 30 June 2011 that 60.7% of Malaysians, or 17.7 million people, used the Internet. Facebook was the most popular application [1]. Communication sites such as blogger.com, mudah.com and Twitter were among the top 10 applications that Malaysians used on the Internet [2]. Subjective words that identify private states can be found using a dedicated lexicon such as WordNet or SentiWordNet. At the beginning of this century, Pang, Lee and Vaithyanathan [5] began applying a machine learning approach to opinion mining. They successfully used text mining techniques to mine opinions from 700 positive and 700 negative movie reviews, and concluded that opinion mining with the machine learning approach required additional steps to identify sentiment.
