Abstract

Social media now generate big data, both structured and unstructured. Social media are powerful marketing tools, and social big data require real-time tracking and analytics because speed may be the most important source of competitive business advantage. Compared with batch sentiment analysis on a big data analytics platform, real-time analytics is data intensive by nature and must efficiently collect and process large volumes of high-velocity data. Real-time multiclass sentiment analysis classifies text into more detailed sentiment labels as the data arrives. However, multiclass sentiment analysis with a single-tier architecture, in which one classification model is trained on the entire labeled data set, may reduce classification accuracy. In this paper, a Real-time Multi-tier Sentiment Analysis system (RMSA) is proposed to achieve high multiclass classification performance in real time. The proposed system combines a lexicon-based and a learning-based classification scheme within a multi-tier architecture. Real-time Twitter stream data is collected by Apache Flume, and the large volumes of high-velocity social data are analyzed efficiently by Spark. To improve classification accuracy, a suitable classifier is selected by comparing the accuracy of three learning-based multiclass classification techniques: Naive Bayes, Linear SVC and Logistic Regression. The evaluation results show that Real-time Multi-tier Sentiment Analysis achieves promising accuracy and that Linear SVC outperforms the other two techniques for Real-time Multi-tier Sentiment Analysis.
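
The idea of combining a lexicon-based tier with a learning-based tier can be illustrated with a minimal sketch. The abstract does not say how the tiers are divided, so everything here is an assumption made for illustration: the first tier uses a toy lexicon to pick a coarse polarity, and the second tier trains one small Naive Bayes model per polarity branch to assign a finer label. The lexicon, the label names, the sample texts and all function names are hypothetical, and the sketch uses plain Python rather than Flume or Spark.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy lexicon; a real system would use a full sentiment lexicon.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}

def tier1_polarity(tokens):
    """Tier 1 (lexicon based, assumed): coarse polarity from summed word scores."""
    score = sum(LEXICON.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing, for illustration only."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.label_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n / total_docs)  # log prior
            for t in tokens:
                logp += math.log((counts[t] + 1) / denom)  # smoothed likelihood
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

def train_tier2(samples):
    """Tier 2 (learning based, assumed): one fine-grained model per tier-1 branch."""
    by_branch = defaultdict(lambda: ([], []))
    for text, fine_label in samples:
        tokens = text.lower().split()
        docs, labels = by_branch[tier1_polarity(tokens)]
        docs.append(tokens)
        labels.append(fine_label)
    return {b: TinyNaiveBayes().fit(d, l) for b, (d, l) in by_branch.items()}

def classify(text, tier2_models):
    """Route the text through tier 1, then refine with that branch's model."""
    tokens = text.lower().split()
    branch = tier1_polarity(tokens)
    model = tier2_models.get(branch)
    # Fall back to the coarse label when no model was trained for the branch.
    return model.predict(tokens) if model else branch

# Made-up training samples with hypothetical fine-grained labels.
SAMPLES = [
    ("love this great phone", "very positive"),
    ("great camera love it", "very positive"),
    ("good battery", "positive"),
    ("good screen overall", "positive"),
    ("bad battery", "negative"),
    ("bad screen overall", "negative"),
    ("awful phone hate it", "very negative"),
    ("hate this awful camera", "very negative"),
]
MODELS = train_tier2(SAMPLES)
```

Calling `classify("love the great camera", MODELS)` routes the text through the positive branch before the fine-grained model runs, so each second-tier model only has to separate labels within one polarity. In a comparison like the one the abstract describes, the second-tier model is the part that would be swapped between Naive Bayes, Linear SVC and Logistic Regression.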
