Social media has become a real-world sensor for observing the pulse of society. Although it provides unique communication opportunities, it also brings serious challenges. One of them is hate speech, which attacks individuals or targeted groups. Previous researchers claim that, among social networks, Twitter is the platform most used to spread hate speech. However, the sheer scale of the data makes it hard to capture and understand the nature of hate speech on Twitter directed at specific food brands. In this study, sentiment analysis techniques were used to filter hate speech on Twitter concerning three popular food brands in Malaysia. The study was conducted in five phases: raw data collection, pre-processing, sentiment analysis, visualization, and performance evaluation. The corpus consists of 28,898 data samples based on user tweet searches. The Twitter API was used to collect the tweets, and SQLite was used to store all the sample data. VADER sentiment analysis was used to classify tweet sentiment as positive, negative, or neutral. In the visualization phase, the sentiment results for the three food brands were visualized using histograms to gain insight into the sentiment analysis. Then, three machine learning methods were implemented and compared to identify the best model for sentiment analysis. The Decision Tree classifier outperformed Support Vector Machine and Logistic Regression in average accuracy, reaching 99.99% on the training data set. This study provides insights to support human decision-making. With the growth of opinions expressed in multimedia on social media, such as spoken feedback on Twitter, sentiment analysis has the potential to become a low-cost endeavour that supports news aggregation.
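The sentiment analysis and visualization phases can be illustrated with a minimal sketch. It assumes the conventional VADER cut-offs on the compound score (0.05 and -0.05); the abstract does not state which thresholds the study used. The function names and the example scores are hypothetical and are not taken from the study.

```python
from collections import Counter

# Conventional VADER cut-offs on the compound score. These are an assumption:
# the abstract does not say which thresholds the study applied.
POSITIVE_THRESHOLD = 0.05
NEGATIVE_THRESHOLD = -0.05


def label_from_compound(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to one of three sentiment labels."""
    if compound >= POSITIVE_THRESHOLD:
        return "positive"
    if compound <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def sentiment_histogram(compound_scores):
    """Count tweets per sentiment label, i.e. the bar heights of a histogram."""
    counts = Counter(label_from_compound(score) for score in compound_scores)
    return {label: counts.get(label, 0)
            for label in ("positive", "negative", "neutral")}


if __name__ == "__main__":
    # Hypothetical compound scores for one brand. In practice each score could
    # come from SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
    # in the vaderSentiment package.
    example_scores = [0.6, -0.7, 0.0, 0.05]
    print(sentiment_histogram(example_scores))
```

Computing one such dictionary per brand gives the counts that a histogram of the three brands would display, and the labels could also serve as targets when comparing classifiers.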