The Best Techniques to Deal with Unbalanced Sequential Text Data in Deep Learning

Sumarni Adi, Andi Sunyoto, Mardhiya Hayaty, Ainul Yaqin, Bety Wulan Sari, Awaliyatul Hikmah

doi:10.14569/ijacsa.2022.0131177

Abstract

Datasets with a balanced class distribution are often difficult to find in real life. Although various methods for handling class imbalance have been developed and proven successful with shallow learning algorithms, work on handling unbalanced classes with deep learning is still limited. Most existing studies focus only on image data using the Convolutional Neural Network (CNN) architecture. In this study, we apply several class-imbalance handling techniques to unbalanced text data. We use both a data-level approach, with resampling techniques applied to word vectors, and an algorithm-level approach, with Weighted Cross-Entropy Loss (WCEL), to handle imbalanced text classification with a Bidirectional Long Short-Term Memory (BiLSTM) architecture. We tested each method on three datasets with different characteristics and levels of imbalance. In our experiments, each technique performed differently on each dataset.
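The two families of techniques named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say which resampling method or weighting scheme was used, so this sketch makes three assumptions: random oversampling stands in for "resampling", class weights follow the common inverse-frequency heuristic, and the loss is normalized by the sum of the sample weights. The function names are hypothetical.

```python
import numpy as np


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: n_samples / (num_classes * class_count)."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts[counts == 0] = 1.0  # guard against empty classes (assumption)
    return len(labels) / (num_classes * counts)


def weighted_cross_entropy(probs, labels, class_weights, eps=1e-12):
    """Algorithm-level: weighted mean of -w[y] * log p[y] over the batch.

    probs has shape (n_samples, num_classes) and holds predicted probabilities.
    """
    picked = probs[np.arange(len(labels)), labels]  # probability of the true class
    w = class_weights[labels]                       # per-sample weight
    return float(np.sum(-w * np.log(picked + eps)) / np.sum(w))


def random_oversample(vectors, labels, seed=0):
    """Data-level: duplicate minority-class rows until all classes match the largest."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    keep = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(labels == c)
        # Draw (target - n) extra indices with replacement from this class.
        extra = members[rng.integers(0, len(members), size=target - n)]
        keep.append(np.concatenate([members, extra]))
    idx = np.concatenate(keep)
    return vectors[idx], labels[idx]


# Toy usage: 3 samples of class 0, 1 sample of class 1.
labels = np.array([0, 0, 0, 1])
vectors = np.arange(8, dtype=float).reshape(4, 2)  # stand-in for word vectors
weights = inverse_frequency_weights(labels, num_classes=2)
probs = np.full((4, 2), 0.5)                       # an uninformative classifier
loss = weighted_cross_entropy(probs, labels, weights)
balanced_vectors, balanced_labels = random_oversample(vectors, labels)
```

With the weights in place, a mistake on the minority class costs more than a mistake on the majority class, while oversampling instead changes what the model sees during training. Either output could then feed a BiLSTM classifier in a deep learning framework of choice.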


Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2022
License type: CC BY
