WITHDRAWN: Equivalence classifier chain Label power set tokenization of toxic comment multi label classification using machine learning

Munisamy Shyamala Devi, M Sumithra, Saranya Vivekanandan, P Naveen Kumar

doi:10.1016/j.matpr.2021.02.772

Abstract

Maintaining discipline and decorum on online platforms has become essential as technology grows. Many uncivil, threatening, and degrading messages are sent by unauthorized users through electronic mail and mobile messages. Exactly identifying and removing toxic messages from social media is critical and remains a challenging problem. Against this background, this paper makes the following contributions towards toxic comment classification. First, the Jigsaw toxic comment classification dataset, obtained from the Kaggle repository, is subjected to data pre-processing. Second, the distribution of the toxic behaviour classes is identified and depicted graphically. Third, TF-IDF is used to select the most likely features of the dataset using both unigrams and bigrams. Fourth, the extracted Count-vectorized features from the toxic comment classification dataset are fitted to various classifiers, and the efficiency is analysed for binary equivalence, classifier chain, and label power set classification. Fifth, the extracted TF-IDF-vectorized features from the dataset are fitted to various classifiers, and the efficiency is analysed for the same three classification schemes. Sixth, the extracted Hash-vectorized features from the dataset are fitted to various classifiers, and the efficiency is again analysed for the same three schemes. Efficiency is analysed using classification evaluation metrics. Experimental results show that the Kernel SVM classifier achieves 87% accuracy with binary equivalence classification for all the tokenization methods. The Random Forest classifier achieves 88% accuracy with the classifier chain and 87% with label power set classification for all the tokenization methods.
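The three multi-label schemes named in the abstract differ mainly in how the label matrix is turned into training targets. The following is a minimal sketch of that difference in plain Python, not the authors' implementation. It assumes that "binary equivalence" denotes the per-label decomposition usually called binary relevance. The label names, the toy matrices `X` and `Y`, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: 4 comments, 3 labels, 1 numeric feature per comment.
LABELS = ["toxic", "insult", "threat"]
X = [[0.1], [0.2], [0.3], [0.4]]
Y = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
]


def binary_targets(Y):
    """Per-label decomposition: one independent binary target vector per label."""
    return [list(column) for column in zip(*Y)]


def classifier_chain_tasks(X, Y, order):
    """Build one training task per label, in chain order.

    The k-th task sees the original features plus the true values of the
    labels that come earlier in the chain (the usual training-time setup).
    """
    tasks = []
    for k, label_idx in enumerate(order):
        earlier = order[:k]
        X_aug = [list(x) + [y[j] for j in earlier] for x, y in zip(X, Y)]
        target = [y[label_idx] for y in Y]
        tasks.append((X_aug, target))
    return tasks


def label_powerset_targets(Y):
    """Map each distinct label combination to a single multiclass id."""
    class_of = {}
    encoded = []
    for row in Y:
        key = tuple(row)
        if key not in class_of:
            class_of[key] = len(class_of)
        encoded.append(class_of[key])
    return encoded, class_of


if __name__ == "__main__":
    print("per-label targets:", binary_targets(Y))
    for X_aug, target in classifier_chain_tasks(X, Y, order=[0, 1, 2]):
        print("chain task:", X_aug, "->", target)
    print("label power set:", label_powerset_targets(Y))
```

In this sketch the per-label scheme trains three unrelated binary problems, the chain lets later labels condition on earlier ones, and the label power set turns the task into a single multiclass problem whose number of classes equals the number of distinct label combinations seen in the data.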

Journal: Materials Today: Proceedings
Publication Date: Mar 1, 2021
Citations: 1

Similar Papers

Harnessing Multi-label Classification Approaches for Economic Phenomena Categorization
Nofriani ... Novianto Budi Kurniawan
ASEAN Journal on Science and Technology for Development | VOL. 38
31 Aug 2021

Multi-label Classification for Hate Speech and Abusive Language in Indonesian-Local Languages
Ajeng Dwi Asti ... Indra Budi
23 Oct 2021

Aspect-Based Sentiment Analysis and Emotion Detection for Code-Mixed Review
Andi Suciati ... Indra Budi
International Journal of Advanced Computer Science and Applications | VOL. 11
01 Jan 2020

What Users Want for Gig Economy Platforms: Sentiment Analysis Approach
Nadina Adelia Indrawan ... Arfive Gandhi
21 Oct 2020
