Abstract

Datasets with a balanced class distribution are often difficult to find in real life. Although various methods for handling imbalanced classes have been developed and proven successful with shallow learning algorithms, work on handling them with deep learning approaches is still limited. Most of these studies focus only on image data using the Convolutional Neural Network (CNN) architecture. In this study, we apply several class-imbalance handling techniques to imbalanced text classification with a Bidirectional Long Short-Term Memory (BiLSTM) architecture. We use both a data-level approach, with resampling techniques applied to word vectors, and an algorithm-level approach, with Weighted Cross-Entropy Loss (WCEL). We tested each method on three datasets with different characteristics and levels of imbalance. Our experiments show that each technique performs differently on each dataset.
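
As an illustration of the algorithm-level approach, a weighted cross-entropy loss can be sketched as follows. This is a minimal sketch, not the implementation used in the study: the inverse-frequency weighting scheme, the normalization by the sum of weights, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions made for the example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: weight_c = N / (K * count_c).

    Rare classes receive larger weights; a perfectly balanced
    dataset gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted cross-entropy over a batch.

    probs   : list of per-class probability lists (one list per sample)
    labels  : list of integer class labels
    weights : dict mapping class label -> weight

    The sum of weighted negative log-likelihoods is divided by the
    sum of the weights of the samples in the batch (an assumption).
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Hypothetical toy batch: three majority-class samples, one minority-class sample.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels)
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights, the single minority-class sample contributes as much total weight to the loss as the three majority-class samples combined, so misclassifying it is penalized more heavily than in the unweighted loss.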
