Abstract
As technology grows, maintaining appropriate discipline and decorum has become an essential requirement on online platforms. Many uncivilized, threatening and degrading messages are sent by unauthorized users through electronic mail and mobile messages. Exact identification and removal of toxic messages from social media is critical and remains a challenging problem. Against this background, this paper makes the following contributions towards toxic comment classification. First, the Jigsaw toxic comment classification dataset, taken from the Kaggle repository, is subjected to data pre-processing. Second, the distribution of the toxic behaviour classes is identified and depicted graphically. Third, TF-IDF is used to isolate the most informative features of the dataset using both unigrams and bigrams. Fourth, the Count-vectorized features extracted from the dataset are fitted to various classifiers, and the efficiency is analysed for binary equivalence, classifier chain and label powerset classification. Fifth, the TF-IDF-vectorized features extracted from the dataset are fitted to various classifiers, and the efficiency is analysed for the same three classification schemes. Sixth, the Hash-vectorized features extracted from the dataset are fitted to various classifiers, and the efficiency is again analysed for the same three schemes. The efficiency analysis uses standard classification evaluation metrics. Experimental results show that the Kernel SVM classifier achieves 87% accuracy with binary equivalence classification for all the tokenization methods. The Random Forest classifier achieves 88% accuracy with the classifier chain and 87% accuracy with label powerset classification for all the tokenization methods.
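The three vectorization methods and the three multi-label schemes named above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy comments and the two label columns are hypothetical, the TF-IDF weight is the unsmoothed textbook form tf * log(N / df), and "binary equivalence" is assumed to mean what is usually called binary relevance (one independent binary problem per label). No classifier is trained here; the sketch only shows how features and targets would be prepared.

```python
import math
import zlib
from collections import Counter

def ngrams(text, n_max=2):
    """Lowercased whitespace tokens plus contiguous n-grams up to n_max."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]

def count_vectors(docs):
    """Count vectorization: raw unigram and bigram frequencies per document."""
    return [Counter(ngrams(d)) for d in docs]

def tfidf_vectors(docs):
    """TF-IDF: term frequency times log(N / document frequency)."""
    counts = count_vectors(docs)
    df = Counter(term for c in counts for term in c)
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in c.items()} for c in counts]

def hash_vectors(docs, n_buckets=16):
    """Hash vectorization: each term is mapped to one of n_buckets columns."""
    out = []
    for c in count_vectors(docs):
        vec = [0] * n_buckets
        for term, tf in c.items():
            vec[zlib.crc32(term.encode("utf-8")) % n_buckets] += tf
        out.append(vec)
    return out

def binary_relevance(Y):
    """One independent binary target column per label."""
    return [list(col) for col in zip(*Y)]

def classifier_chain_inputs(X, Y):
    """Training inputs for label j: features plus the true labels 0..j-1."""
    return [[x + y[:j] for x, y in zip(X, Y)] for j in range(len(Y[0]))]

def label_powerset(Y):
    """Each distinct label combination becomes one multi-class target."""
    classes = {}
    targets = [classes.setdefault(tuple(y), len(classes)) for y in Y]
    return targets, classes

# Hypothetical toy data: four comments, two labels (say, toxic and insult).
docs = ["you are great", "you are awful", "awful awful person", "you awful person"]
Y = [[0, 0], [1, 0], [1, 1], [1, 0]]

X = hash_vectors(docs)
per_label_targets = binary_relevance(Y)      # one target list per label
chain_inputs = classifier_chain_inputs(X, Y) # one feature matrix per label
powerset_targets, combos = label_powerset(Y) # one class id per comment
```

The trade-off between the schemes is visible in the shapes: binary relevance ignores any dependence between labels, the chain lets later labels see earlier ones, and the label powerset models combinations directly at the cost of one class per observed combination.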