Developing an efficient Twitter sentiment analysis framework remains a challenging and demanding task. Previous work has produced a variety of machine learning and deep learning mechanisms for tweet analysis, combining data clustering, optimization, and classification. However, these approaches are difficult to interpret, mathematically complex, time-consuming, and prone to high error rates. The proposed work therefore applies intelligent data mining techniques to design and develop a sentiment analysis framework. First, preprocessing normalizes the data to produce a noise-free, balanced dataset. Next, the Hierarchical Tweet Expression Clustering (HTEC) algorithm clusters the attributes by distance to simplify classification. Multi-faceted features, namely vector space Bag-of-Words (BOW) and Term Frequency-Inverse Document Frequency (TF-IDF), are then extracted for training and testing the classifier. Finally, the Spatial Dense Bi-directional Long Short-Term Memory (SDBi-LSTM) classifier predicts the sentiment as positive, negative, or neutral. The proposed system is validated on the Stanford Twitter Sentiment Test Set (STS-Test/Sentiment140), GOP Debate, and IMDB datasets using accuracy, F1-score, training time, testing time, true positive rate, and false positive rate, among other measures. The results are also compared with recent prediction models to demonstrate the superiority of the proposed framework.