Abstract

Data mining is the application of techniques for extracting useful knowledge from large volumes of data; it is also known as knowledge discovery from data. Several technologies support it, including statistics (which lays the foundation of data mining), artificial intelligence (which applies human-like reasoning to the processing of data), and machine learning (which combines statistics and artificial intelligence). In this work, the authors use natural language processing (NLP) to perform sentiment analysis with various NLP feature extraction techniques. Sentiment analysis is especially important for gathering users’ feedback and opinions about products. In this paper, the authors perform sentiment analysis of Twitter data. Each data point (a tweet, in this case) is classified as a “positive tweet” or a “negative tweet”. Six techniques are used for this classification: information gain, Gini index (GI), Naïve Bayes, K-nearest neighbor, random forest, and gradient boost. Finally, the classifications produced by these techniques are compared on accuracy, precision, recall, and F1-score. Experimental results suggest that random forest performs best in this analysis, yielding an accuracy of 97%.

Keywords: Sentiment analysis; Classification; Data mining; Naïve Bayes; Random forest; Gradient boost
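
As a rough illustration of the four comparison metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F1-score for a two-class tweet labelling. The function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Return accuracy, precision, recall and F1 for a two-class labelling.

    `positive` names the class treated as the positive one; every other
    label is treated as negative. (Hypothetical helper for illustration.)
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels, invented for this example only.
y_true = ["positive", "positive", "negative", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "positive", "positive", "negative"]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing the same four numbers for each classifier's predictions on a shared held-out set is one way a comparison of this kind could be tabulated.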
