Alternative Non-BERT Model Choices for the Textual Classification in Low-Resource Languages and Environments

Syed Mustavi Maheen

doi:10.48448/9h9f-y066

Abstract

Natural Language Processing (NLP) tasks in non-dominant and low-resource languages have not experienced significant progress. Although pre-trained BERT models are available, GPU dependency, large memory requirements, and data scarcity often limit their applicability. As a solution, this paper proposes a fusion chain architecture composed of one or more layers of CNN, LSTM, and BiLSTM, and identifies the precise configuration and chain length. The study shows that a simpler, CPU-trainable non-BERT fusion model, CNN + BiLSTM + CNN, is sufficient to surpass the textual classification performance of BERT-related models in resource-limited languages and environments. The fusion architecture competitively approaches state-of-the-art accuracy in several Bengali NLP tasks and in a six-class emotion detection task on a newly developed Bengali dataset. Interestingly, the performance of the identified fusion model (for instance, CNN + BiLSTM + CNN) also holds for other low-resource languages and environments. An efficacy study shows that the CNN + BiLSTM + CNN model outperforms a BERT implementation for Vietnamese and performs almost equally in English NLP tasks under artificial data scarcity. On the GLUE benchmark and other datasets such as Emotion, IMDB, and Intent classification, the CNN + BiLSTM + CNN model often surpasses or competes with BERT-base, TinyBERT, DistilBERT, and mBERT. In addition, a position-sensitive self-attention layer further improves the fusion models’ performance in Bengali emotion classification. The models can also be compressed to as much as ≈ 5× smaller through pruning and retraining, making them more viable for resource-constrained environments. Together, these results may help NLP practitioners and serve as a blueprint for NLP model choices in textual classification for low-resource languages and environments.
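To make the ≈ 5× compression figure concrete, the following is a minimal sketch of how pruning relates to model size. The abstract does not state which pruning criterion the study uses, so magnitude pruning is an assumption here, and the function names `magnitude_prune` and `compression_ratio` and the toy weight list are hypothetical. Keeping the largest 20% of weights leaves one nonzero weight in five, which corresponds to a 5× reduction in stored nonzero parameters.

```python
def magnitude_prune(weights, keep_fraction):
    """Zero out all but the largest-magnitude `keep_fraction` of the weights.

    Assumption: magnitude-based pruning; the abstract does not name the
    criterion actually used in the study.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Indices ordered from largest to smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def compression_ratio(weights):
    """Ratio of total weights to nonzero weights (assumes sparse storage)."""
    nonzero = sum(1 for w in weights if w != 0.0)
    if nonzero == 0:
        raise ValueError("all weights are zero")
    return len(weights) / nonzero


# Hypothetical toy weights standing in for one flattened layer.
weights = [0.9, -0.05, 0.02, -0.7, 0.01, 0.3, -0.03, 0.04, -0.6, 0.08]
pruned = magnitude_prune(weights, keep_fraction=0.2)
print(pruned)
print(compression_ratio(pruned))
```

The sketch covers only the pruning step. The retraining step mentioned in the abstract, in which the surviving weights are fine-tuned to recover accuracy, is not shown, and the ratio counts nonzero weights rather than bytes on disk.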
