Abstract
Sentiment analysis is widely used for customer review analysis, popularity analysis of electoral candidates, hate speech detection and similar applications. Sentiment analysis of tweets faces challenges such as highly skewed classes, high-dimensional feature vectors and highly sparse data. In this study, we analyze the improvement achieved by successively addressing these problems in order to determine their severity for sentiment analysis of tweets. First, we prepared a comprehensive data set of Urdu tweets for sentiment analysis-based hate speech detection. To improve the performance of the sentiment classifier, we employed dynamic stop word filtering, the Variable Global Feature Selection Scheme (VGFSS) and the Synthetic Minority Over-sampling Technique (SMOTE) to handle the sparsity, dimensionality and class imbalance problems, respectively. We used two machine learning algorithms, Support Vector Machines (SVM) and Multinomial Naïve Bayes (MNB), in our experiments. Our results show that addressing class skew together with alleviating the high dimensionality problem yields the largest improvement in the overall performance of sentiment analysis-based hate speech detection.
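To make the class imbalance step more concrete, the following is a minimal sketch of the interpolation idea behind SMOTE: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the configuration used in the study; the function name `smote_sketch`, the toy vectors, and the values of `k` and `seed` are all hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    minority: list of equal-length numeric feature vectors (hypothetical toy data).
    k: number of nearest minority neighbours considered (assumed value).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy term-weight vectors standing in for minority-class tweets.
minority = [[1.0, 0.0, 2.0], [2.0, 1.0, 0.0], [0.0, 3.0, 1.0]]
new_points = smote_sketch(minority, n_new=4)
```

Because each synthetic point is a convex combination of two existing minority points, it always lies within the range spanned by the minority class in every feature dimension.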
Highlights
Sentiment analysis is one of the trending topics of research in Natural Language Processing and text classification.
Methodology: We discuss how we built the corpus for this study and the techniques we used to improve the performance of the sentiment analysis-based hate speech classifier for Urdu.
Performance evaluation: To evaluate the effectiveness of the proposed solutions to the class imbalance, sparsity and dimensionality problems and to compare the results, we used the micro F1 measure and 5-fold cross-validation.
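The evaluation protocol named above can be sketched as follows. This is a minimal illustration, assuming single-label classification and contiguous, unshuffled folds; the helper names `micro_f1` and `k_fold_indices` and the class labels `"hate"` and `"neutral"` are hypothetical and are not taken from the study.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif t == label and p != label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical labels and predictions for five tweets.
y_true = ["hate", "neutral", "neutral", "hate", "neutral"]
y_pred = ["hate", "neutral", "hate", "neutral", "neutral"]
score = micro_f1(y_true, y_pred)

# Index splits for a hypothetical corpus of 10 tweets.
folds = list(k_fold_indices(10, k=5))
```

In the single-label setting assumed here, micro F1 coincides with accuracy, since every misclassified sample counts once as a false positive and once as a false negative. In practice, each fold's classifier would be trained on the `train` indices and scored on the `test` indices, and the five scores averaged.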
Summary
Sentiment analysis is one of the trending topics of research in Natural Language Processing and text classification. Using this technique, one can extract the semantic sense of a given word, sentence or document, and it is widely used in areas ranging from product review analysis to gauging the popularity of election candidates. Aspect-level sentiment analysis examines individual sentences at a fine-grained level to determine the semantic orientation toward a particular entity in a sentence. This type of sentiment analysis may yield multiple entities and multiple sentiments in the same sentence [2]. Apart from these three types of sentiment analysis, conversational sentiment analysis has been introduced recently; it differs from sentence-level sentiment analysis in that it also captures context information in dialogues [3].