Twitter, the largest microblogging platform, has reported more than 330 million active users in recent years. Many users express their sentiments about politics, sports, products, personalities, and other topics. Sentiment analysis has emerged as a specialized branch of machine learning in which tweets are classified into binary classes to provide insight into sentiment. A major step in sentiment classification is feature selection, which primarily revolves around parts of speech (POS). A few techniques focused only on single features such as adjectives, adverbs, and verbs, while others examined subtypes of these features, such as comparative adjectives, superlative adjectives, or general adverbs. Furthermore, POS as linguistic entities have been studied and extensively classified by researchers, for example in the CLAWS-C7 tagset. For sentiment analysis, however, no study has examined all possible POS features under similar conditions to draw firm conclusions. This research has the following objectives: 1) examining the impact of various types of adjectives and adverbs that have not previously been explored for sentiment classification; 2) analyzing potential combinations of adjective and adverb types; and 3) conducting a comparison with a benchmark dataset for better classification accuracy. To assess the concept, a renowned human-annotated dataset of tweets is investigated. Results showed that classification accuracy for adjectives improved up to 83% based on the general superlative adjective, whereas for adverbs, the comparative general adverb also showed a significant accuracy improvement. Their combination with general adjectives and general adverbs also played a substantial role. The previously unexplored adjective and adverb types yielded better accuracy than a state-of-the-art probabilistic model.
Compared with a lexicon-based model, the proposed model removes the dependency on a lexicon-based dictionary, in which each term must first be matched for semantic orientation. These outcomes also help reduce processing time where huge volumes of data need to be processed swiftly. This contribution provides significant knowledge and direction for domain experts. In the future, the proposed technique will be explored for other types of textual data across different domains.