Text data augmentations: Permutation, antonyms and negation

Giannis Haralabopoulos, Mercedes Torres Torres, Ioannis Anagnostopoulos, Derek Mcauley

doi:10.1016/j.eswa.2021.114769


Open Access

https://doi.org/10.1016/j.eswa.2021.114769


Journal: Expert Systems with Applications
Publication Date: Mar 11, 2021
Citations: 24
License type: other-oa

Abstract

Text has traditionally been used to train automated classifiers for a multitude of purposes, such as classification, topic modelling and sentiment analysis. State-of-the-art LSTM classifiers require a large number of training examples to avoid biases and generalise successfully. Labelled data greatly improves classification results, but not all modern datasets include large numbers of labelled examples. Labelling is a complex task that can be expensive and time-consuming, and can potentially introduce biases. Data augmentation methods create synthetic data based on existing labelled examples, with the goal of improving classification results. These methods have been successfully used in image classification tasks, and recent research has extended them to text classification. We propose a method that uses sentence permutations to augment an initial dataset while retaining key statistical properties of the dataset. We evaluate our method on eight different datasets with a baseline Deep Learning process. This permutation method significantly improves classification accuracy, by an average of 4.1%. We also propose two further text augmentations, antonym and negation, that reverse the classification of each augmented example. We test these two augmentations on three eligible datasets, and the results suggest an improvement in classification accuracy, averaged across all datasets, of 0.35% for antonym and 0.4% for negation when compared to our proposed permutation augmentation.
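To make the three augmentations concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not specify the details, so several choices here are assumptions: permutation is taken to mean shuffling the word order within an example (which leaves its length and term frequencies unchanged), labels are assumed to be binary (0 or 1) so that "reversing the classification" is `1 - label`, and the function names, the `TOY_ANTONYMS` lexicon and the list of verbs used for negation are all hypothetical.

```python
import random

# Hypothetical toy lexicon; a real system would use a proper antonym resource.
TOY_ANTONYMS = {"good": "bad", "bad": "good", "love": "hate", "hate": "love"}


def permute_example(text, label, n_aug=2, seed=0):
    """Return up to n_aug word-order permutations of text, keeping the label.

    Assumption: shuffling words preserves the bag-of-words statistics
    (length, term frequencies) of the example.
    """
    rng = random.Random(seed)
    tokens = text.split()
    original = " ".join(tokens)
    results = set()
    # Cap the number of attempts so very short texts still terminate.
    for _ in range(n_aug * 10):
        if len(results) >= n_aug:
            break
        shuffled = tokens[:]
        rng.shuffle(shuffled)
        candidate = " ".join(shuffled)
        if candidate != original:
            results.add(candidate)
    return [(candidate, label) for candidate in sorted(results)]


def antonym_example(text, label, antonyms=TOY_ANTONYMS):
    """Swap words for their antonyms and flip a binary label.

    Returns None when no word in the text has a known antonym.
    """
    tokens = text.split()
    swapped = [antonyms.get(tok.lower(), tok) for tok in tokens]
    if swapped == tokens:
        return None
    return " ".join(swapped), 1 - label


def negate_example(text, label, verbs=("is", "was", "are", "were")):
    """Insert 'not' after the first listed verb and flip a binary label.

    Returns None when the text contains none of the listed verbs.
    """
    tokens = text.split()
    for i, tok in enumerate(tokens):
        if tok.lower() in verbs:
            negated = tokens[: i + 1] + ["not"] + tokens[i + 1 :]
            return " ".join(negated), 1 - label
    return None


if __name__ == "__main__":
    sample, sample_label = "the film was really good", 1
    for augmented in permute_example(sample, sample_label):
        print("permutation:", augmented)
    print("antonym:", antonym_example(sample, sample_label))
    print("negation:", negate_example(sample, sample_label))
```

The label-reversing functions return `None` when they cannot change the text, which is one way to model the abstract's remark that only some datasets are eligible for the antonym and negation augmentations.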
