Hate Speech Detection Using NLP and Machine Learning

Prof Humera Khanam M, Md A Khudhus, Sasanka Boothati

doi:10.22214/ijraset.2024.58962

Abstract

The use of social media has been growing rapidly, making it a medium for individuals to share their opinions, ideas, and thoughts with others. This growth has made it harder to tell a genuine comment from a deliberate attempt to damage or incite hatred against an individual or a group on the basis of community, race, gender, nationality, etc. In this paper, hate speech is detected using sentiment polarity scores and Term Frequency-Inverse Document Frequency (TF-IDF) scores with machine learning algorithms, with the aim of reducing false negatives and false positives through Natural Language Processing (NLP). The machine learning algorithms used are Logistic Regression and Random Forest Classifier. The NLP phases are applied to preprocess a dataset of about 25 thousand tweets from the social media giant “Twitter”, available on Kaggle. The two ML algorithms are then trained on the processed tweets, once with vaderSentiment polarity scores and once with TF-IDF scores, and evaluation metrics are obtained for each. The results show that sentiment polarity scores (7 points) are less accurate in detecting hate speech than TF-IDF scores (8 points).

Full Text