Abstract
Offensive language detection is the task of identifying user-generated offensive content, such as insults, pain, profanity, and racism, that targets a specific individual or group on social media. As social media platforms become more prominent, offensive language is used more frequently and has become a major challenge in modern society. This paper proposes a novel effective offensive language classification (EOLC) technique to address this challenge. The study uses English-language tweets and comments from X (Twitter) and YouTube containing offensive, mild, swear, and non-offensive content. The tweets and comments are first pre-processed, and features are then extracted using three techniques: term frequency-inverse document frequency (TF-IDF), Word2Vec, and lexicon-based features. The extracted features are classified using a graph-based deep learning (GDL) method for numerical representation and decision-making. The GDL network is optimized with red fox optimization (RFO) to normalize the weights and biases of the network and achieve better accuracy. The proposed GDL model achieves its highest classification accuracy of 95.5 % on the X (Twitter) dataset and 96.8 % on the YouTube dataset. The results obtained from GDL are more accurate and of higher quality than those obtained from traditional classifiers. The proposed EOLC method improves overall accuracy by 5.56 %, 7.4 %, 7.7 %, and 10.2 % over Text CNN, CNN-LSTM, DRNN, and LogitBoost, respectively.
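The pre-processing and feature-extraction stage described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cleaning rules, the `tf * log(N / df)` weighting variant, the toy lexicon, and all function names are assumptions made for illustration, and the Word2Vec embeddings, the GDL classifier, and the RFO optimizer are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon; a real system would load a curated offensive-word list.
TOY_LEXICON = {"idiot", "stupid"}


def tokenize(text):
    """Lowercase, drop URLs and @mentions, and keep alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def lexicon_ratio(text, lexicon=TOY_LEXICON):
    """Fraction of tokens found in the lexicon (one lexicon-based feature)."""
    toks = tokenize(text)
    return sum(t in lexicon for t in toks) / len(toks) if toks else 0.0


def extract_features(docs):
    """Concatenate each TF-IDF vector with the lexicon ratio."""
    vocab, vectors = tfidf_vectors(docs)
    return vocab, [vec + [lexicon_ratio(d)] for vec, d in zip(vectors, docs)]


if __name__ == "__main__":
    sample = ["You are an idiot @bob", "Have a nice day", "nice video http://x.co"]
    vocab, feats = extract_features(sample)
    print(len(vocab), [round(f[-1], 2) for f in feats])
```

Each resulting row is a fixed-length numeric vector, which is the kind of input a downstream classifier such as the GDL network would consume. In practice, a library vectorizer and a trained embedding model would likely replace these hand-rolled helpers.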