Abstract

The general language model BERT, pre-trained on the cross-domain text corpora BookCorpus and Wikipedia, achieves excellent performance on a range of natural language processing tasks when fine-tuned for downstream tasks. However, it still lacks the task-specific and domain-related knowledge needed to further improve performance, and more detailed analyses of fine-tuning strategies are necessary. To address these problems, a BERT-based text classification model, BERT4TC, is proposed: it constructs an auxiliary sentence to turn the classification task into a binary sentence-pair one, aiming to address the limited-training-data and task-awareness problems. The architecture and implementation details of BERT4TC are presented, as well as a post-training approach for addressing BERT's domain challenge. Finally, extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden state vector selection; BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based methods and fine-tuning methods, achieving new state-of-the-art performance on the multi-class classification datasets. On the binary sentiment classification datasets, our BERT4TC post-trained with a suitable domain-related corpus also achieves better results than the original BERT model.
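
To make the auxiliary-sentence construction concrete, the sketch below shows how a k-class example can be expanded into k binary sentence-pair instances, one per candidate label. The template wording and helper name are assumptions for illustration, not the paper's exact construction.

    # A minimal Python sketch of the auxiliary-sentence idea: one k-class
    # example is expanded into k binary sentence-pair examples, one per
    # candidate label. The template wording is a hypothetical placeholder.

    def build_sentence_pairs(text, true_label, all_labels):
        """Expand one k-class example into k binary sentence-pair examples."""
        pairs = []
        for label in all_labels:
            auxiliary = f"The category of this text is {label}."  # assumed template
            target = 1 if label == true_label else 0  # 1 = matching pair
            pairs.append((text, auxiliary, target))
        return pairs

    # A 3-class topic example becomes 3 binary sentence-pair instances.
    pairs = build_sentence_pairs(
        "The team won the championship last night.",
        true_label="sports",
        all_labels=["sports", "business", "politics"],
    )

Each pair is then fed to BERT as a standard sentence pair, so the label information enters the input itself rather than living only in the classification head.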

Highlights

  • Text classification is a classic task and an active research topic in natural language processing (NLP), aiming to assign pre-defined categories to a given text sequence

  • Pre-training the general language model BERT on a large-scale unlabeled corpus and fine-tuning it on downstream tasks has achieved state-of-the-art results on multiple NLP tasks

  • We propose a BERT-based text classification model that constructs an auxiliary sentence to turn the task into a sentence-pair one, aiming to incorporate more task-specific knowledge and address the task-awareness challenge


Summary

INTRODUCTION

Text classification is a classic task and an active research topic in natural language processing (NLP), aiming to assign pre-defined categories to a given text sequence. To address the challenge of polysemy, some scholars proposed contextualized word vectors, e.g., CoVe [10] and ELMo (Embeddings from Language Models) [11], which learn multiple vectors for a word according to the different contexts in which it appears. ELMo uses vectors derived from a bidirectional LSTM trained with a coupled language model (LM) objective on a large text corpus, and integrates these contextual word vectors with existing task-specific supervised neural architectures. Both CoVe and ELMo successfully generalize traditional word vectors to capture context-sensitive features, but the learned representations are still typically used as features in a downstream model [12]. In the past two years, to avoid heavily engineered task-specific structures and to greatly reduce the number of parameters learned from scratch, some scholars have pursued another direction: pre-training a general language model on a large-scale unsupervised text corpus and fine-tuning it on downstream tasks.
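
As a concrete illustration of this pre-train/fine-tune paradigm, the sketch below runs one fine-tuning step of a pre-trained BERT with a freshly initialized classification head, using the Hugging Face transformers library. It is an assumed minimal setup (model name, hyperparameters, and toy example are illustrative), not the paper's original implementation.

    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    # Load pre-trained weights; only the 2-way classification head is new and
    # learned from scratch, while the encoder parameters are reused as-is.
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    # One fine-tuning step on a toy labeled example (1 = positive sentiment).
    enc = tokenizer("a gripping, beautifully shot film",
                    return_tensors="pt", truncation=True, max_length=128)
    labels = torch.tensor([1])
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    loss = model(**enc, labels=labels).loss
    loss.backward()
    optimizer.step()

Because nearly all parameters arrive pre-trained, only a small learning rate and a few epochs are typically needed, which is what makes the fine-tuning strategies analyzed in this paper (learning rate, sequence length, hidden state selection) so consequential.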

RELATED WORK
BERT4TC
BERT ENCODER
OUTPUT LAYER
EXPERIMENTS
EXPERIMENT 1
EXPERIMENT 2
EXPERIMENT 3
EXPERIMENT 4
EXPERIMENT 5
EXPERIMENT 6
Findings
CONCLUSION