Use of Data Augmentation Techniques in Detection of Antisocial Behavior Using Deep Learning Methods

Viera Maslej-Krešňáková, Júlia Jacková, Martin Sarnovský

doi:10.3390/fi14090260

Open Access

https://doi.org/10.3390/fi14090260

Journal: Future Internet
Publication Date: Aug 31, 2022
Citations: 11
License type: CC BY 4.0

Affiliation: Technical University of Košice

Abstract

This paper examines the use of data augmentation techniques in the detection of antisocial behavior. Data augmentation is a frequently used approach to overcoming a lack of data or imbalanced classes: it generates artificial samples that increase the volume of the training set or balance the target distribution. In the antisocial behavior detection domain, we frequently face both issues, a lack of quality labeled data as well as class imbalance. As the majority of the data in this domain is textual, we must consider augmentation methods suitable for NLP tasks. Easy data augmentation (EDA) is a group of such methods that use simple text transformations to create new, artificial samples. Our main motivation is to explore the usability of EDA techniques on selected tasks from the antisocial behavior detection domain. We focus on the class imbalance problem and apply EDA techniques to two problems: fake news classification and toxic comments classification. In both cases, we train a convolutional neural network classifier and compare its performance on the original and EDA-extended datasets. EDA techniques prove to be very task-dependent, with certain limitations resulting from the data they are applied to. The model's performance on the extended toxic comments dataset improved only marginally, gaining just 0.01 in F1 when only a subset of EDA methods was applied. In this case, EDA techniques were not well suited to texts written in more informal language. On the fake news dataset, by contrast, performance improved more significantly, with the F1 score rising by 0.1. The improvement was most significant in the prediction of the minority class, where F1 improved from 0.67 to 0.86.
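To make the idea of "simple text transformations" concrete, the following is a minimal sketch of the four operations commonly associated with EDA (synonym replacement, random insertion, random swap, random deletion), used to oversample a minority class. It is not the authors' implementation: the abstract does not list the operations or their parameters, so the function names, the probabilities, and the tiny hand-written `synonyms` table are all assumptions made for illustration. A real setup would likely draw synonyms from a lexical resource.

```python
import random

def synonym_replacement(words, synonyms, n, rng):
    """Replace up to n words that have an entry in the (hypothetical) synonym table."""
    words = list(words)
    candidates = [i for i, w in enumerate(words) if w in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        words[i] = rng.choice(synonyms[words[i]])
    return words

def random_insertion(words, synonyms, n, rng):
    """Insert a synonym of a randomly chosen word at a random position, n times."""
    words = list(words)
    for _ in range(n):
        candidates = [w for w in words if w in synonyms]
        if not candidates:
            break
        new_word = rng.choice(synonyms[rng.choice(candidates)])
        words.insert(rng.randint(0, len(words)), new_word)
    return words

def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]

def balance_with_eda(texts, labels, minority_label, synonyms, seed=0):
    """Append augmented minority samples until the minority matches the largest class."""
    rng = random.Random(seed)
    minority = [t for t, y in zip(texts, labels) if y == minority_label]
    new_texts, new_labels = list(texts), list(labels)
    if not minority:
        return new_texts, new_labels  # nothing to augment from
    target = max(labels.count(y) for y in set(labels))
    # Assumed parameters: one edit per sample, deletion probability 0.1.
    ops = [
        lambda w: synonym_replacement(w, synonyms, 1, rng),
        lambda w: random_insertion(w, synonyms, 1, rng),
        lambda w: random_swap(w, 1, rng),
        lambda w: random_deletion(w, 0.1, rng),
    ]
    for k in range(target - len(minority)):
        source = minority[k % len(minority)].split()
        op = rng.choice(ops)
        new_texts.append(" ".join(op(source)))
        new_labels.append(minority_label)
    return new_texts, new_labels
```

Only the training split would normally be balanced this way; augmenting the evaluation data would make scores on the original and extended datasets hard to compare. Because these operations work on word tokens and a synonym table, they have less to act on in misspelled or slang-heavy text, which fits the abstract's observation about informal language.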

Similar Papers

Preliminary Results on the Generation of Artificial Handwriting Data Using a Decomposition-Recombination Strategy
Jose Fernando Adran Otero ... Zhe Sun
23 May 2022

Fake News and Imbalanced Data Perspective
Isha Y Agarwal ... Dipti P Rana
01 Jan 2020

On the use of text augmentation for stance and fake news detection
Ilhem Salah ... Ouajdi Korbaa
Journal of Information and Telecommunication | VOL. 7
19 Apr 2023

Augmented Time Regularized Generative Adversarial Network (ATR-GAN) for Data Augmentation in Online Process Anomaly Detection
Yuxuan Li ... Zhangyue Shi
IEEE Transactions on Automation Science and Engineering | VOL. 19
01 Oct 2022
