Abstract

Social media users respond readily to products and events and share their opinions as raw text, which is classified as semi-structured data. This data is expressed in varied terminology and is noisy by nature: it mixes important information with superfluous detail, yet it gives analysts a way to identify patterns and knowledge. This hidden information must be extracted from language data in order to make informed decisions and create strategic plans for entering new markets. Natural language processing (NLP) and data mining are among the most prominent fields of study for this task, especially for sentiment analysis, the process of identifying the feelings and insights concealed in the data. Twitter is one of the most significant microblogging platforms, with millions of users. These users post status updates known as tweets and use hashtags to share sentiments on different topics. Twitter is therefore regarded as a significant real-time source and as one of the most active opinion indicators. The volume of information produced on Twitter is enormous, and manually scanning the entire data set is a difficult process. This paper proposes an ensemble classifier that categorizes the emotion of tweets by polarity, as positive or negative. The ensemble combines Random Forest (RF), Support Vector Machine (SVM) and Decision Tree (DT) classifiers. The data is collected through the Twitter API and analysed automatically to determine the public view on a particular topic. The features obtained after dimensionality reduction using LDA undergo feature selection with a wrapper-based technique. The iterative wrapper-based technique predicts a score for each feature; features with low scores are discarded and features with high scores are passed on to classification.
The ensemble classifier uses the Adaptive Boosting (AdaBoost) technique, in which the outputs of the Machine Learning (ML) classifiers are combined to produce a single output. AdaBoost combines weak classifiers and aggregates their predictions to form a stronger classifier. The experimental results show that the proposed ensemble classifier achieves an accuracy of 93.42%, which is higher than the existing Convolutional Bidirectional Long Short-Term Memory (ConvBiLSTM) classifier and the Hybrid Lexicon-Naïve Bayes Classifier (HL-NBC), which achieve classification accuracies of 91.53% and 89.61%, respectively.
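
The boosting idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one-feature decision stumps as the weak learners (in place of the RF, SVM and DT classifiers named in the abstract), labels of +1 for positive and -1 for negative polarity, and a synthetic ten-point data set. The names `train_stump`, `adaboost_fit` and `adaboost_predict` are hypothetical helpers introduced only for this illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) for the best stump.

    A stump predicts `sign` when row[feature] <= threshold, otherwise -sign.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] <= thr else -sign for row in X]
                # Weighted error: total weight of the misclassified samples.
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign

def adaboost_fit(X, y, rounds=10):
    """Train up to `rounds` weak learners; return a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error to avoid division by zero or log(0).
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        if err >= 0.5:  # no better than chance: stop boosting
            break
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this learner
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted vote of all weak learners; returns +1 or -1."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Synthetic toy data (an assumption, not from the paper): one feature per
# sample, arranged so that no single threshold separates the two classes.
X = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy after 3 rounds: {accuracy:.2f}")
```

On this toy data the best single stump classifies 7 of the 10 points correctly, while the weighted vote of three stumps classifies all 10, which is the sense in which boosting turns weak classifiers into a stronger one.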
