Text classification model for methamphetamine-related tweets in Southeast Asia using dual data preprocessing techniques

Narongsak Chayangkoon, Anongnart Srivihok

doi:10.11591/ijece.v11i4.pp3617-3628

Open Access

Abstract

Methamphetamine addiction is a prominent problem in Southeast Asia. Drug addicts often discuss illegal activities on popular social networking services, and they spread messages on social media as a means of both buying and selling drugs online. This paper proposes a model, the “text classification model of methamphetamine tweets in Southeast Asia” (TMTA), to identify whether a tweet from Southeast Asia is related to methamphetamine abuse. The research addresses the weakness of bag of words (BoW) by introducing BoW and Word2Vec feature selection (BWF) techniques. A domain-based feature selection method was performed using the BoW dataset and Word2Vec. The BWF dataset provided a smaller number of features than the BoW and TF–IDF datasets. We experimented with three candidate classifiers: support vector machine (SVM), decision tree (J48) and naive Bayes (NB). We found that the J48 classifier with the BWF dataset provided the best performance for the TMTA in terms of accuracy (0.815), F-measure (0.818), Kappa (0.528), Matthews correlation coefficient (0.529) and area under the ROC curve (0.763). Moreover, TMTA achieved the lowest runtime (3.480 seconds) when using J48 with the BWF dataset.
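Several of the evaluation metrics named in the abstract (accuracy, F-measure, Kappa and the Matthews correlation coefficient) can be derived from a binary confusion matrix. The following is a minimal sketch in Python using the standard textbook definitions. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the paper's data, and the paper may report averaged variants of these metrics (for example, a class-weighted F-measure).

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts.

    Assumes every marginal total is non-zero, so no division by zero occurs.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # F-measure (F1) for the positive class: harmonic mean of precision and recall.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_pos + p_neg
    kappa = (accuracy - p_e) / (1 - p_e)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa and the Matthews correlation coefficient both correct for agreement expected by chance, which is why they can be noticeably lower than accuracy when classes are imbalanced.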

Journal: International Journal of Electrical and Computer Engineering (IJECE)	Publication Date: Aug 1, 2021
Citations: 3	License type: CC BY-SA 4.0

Similar Papers

Exploring the Effect of N-grams with BOW and TF-IDF Representations on Detecting Fake News
Amal Esmail Qasem ... Mohammad Sajid
25 Oct 2022

Effects of Light Stemming on Feature Extraction and Selection for Arabic Documents Classification
Yousif A Alhaj ... Mohamed Abd Elaziz
30 Nov 2019

Semantic Analysis of Urdu English Tweets Empowered by Machine Learning
Nadia Tabassum ... Umer Farooq
Intelligent Automation & Soft Computing | VOL. 29
01 Jan 2020

Sentiment analysis of Mass Rapid Transit Jakarta using naïve Bayes classifier and rule-based opinion target detection on Twitter
Dhanika Jeihan Aguinta ... Putra Pandu Adikara
16 Nov 2020
