Abstract

Automatic scoring of short texts suffers from sparse features, word polysemy, and limited contextual information. To address these problems, a short-text automatic scoring model based on bidirectional encoder representations from transformers with bidirectional long short-term memory (BERT-BiLSTM) is proposed. First, the BERT language model is pre-trained on a large-scale corpus to acquire the semantic features of general language. Then, the pre-trained BERT is fine-tuned on the short-text dataset for the downstream scoring task, capturing the semantics of short texts and of keywords in their specific contexts. Next, a bidirectional long short-term memory (BiLSTM) network captures deeper context dependencies. Finally, the resulting feature vectors are fed into a Softmax regression model for automatic scoring. Experimental results show that, compared with benchmark models including convolutional neural networks (CNN), character-level CNN (CharCNN), long short-term memory (LSTM), and BERT, the BERT-BiLSTM short-text automatic scoring model achieves the best average quadratic weighted kappa coefficient.
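The pipeline described above (fine-tuned BERT encoder, BiLSTM over the token representations, Softmax regression over score classes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the checkpoint name `bert-base-uncased`, the hidden size, the number of score classes, and the pooling choice are all placeholders.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class BertBiLSTMScorer(nn.Module):
    """Sketch of a BERT-BiLSTM-Softmax scoring model (hyperparameters assumed)."""

    def __init__(self, num_scores, hidden_size=256, bert_name="bert-base-uncased"):
        super().__init__()
        # Pre-trained BERT supplies general-language semantic features;
        # its weights are fine-tuned on the short-text scoring task.
        self.bert = BertModel.from_pretrained(bert_name)
        # BiLSTM over BERT's token outputs captures deeper, bidirectional
        # context dependencies across the short text.
        self.bilstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=hidden_size,
            batch_first=True,
            bidirectional=True,
        )
        # Linear layer producing logits; softmax is applied at inference
        # (or implicitly by cross-entropy during training).
        self.classifier = nn.Linear(2 * hidden_size, num_scores)

    def forward(self, input_ids, attention_mask):
        bert_out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        seq = bert_out.last_hidden_state          # (batch, seq_len, bert_hidden)
        lstm_out, _ = self.bilstm(seq)            # (batch, seq_len, 2*hidden_size)
        pooled = lstm_out[:, 0, :]                # first-position state; mean-pooling also plausible
        return self.classifier(pooled)            # (batch, num_scores)


# Example usage (score range and input text are illustrative):
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertBiLSTMScorer(num_scores=5)
batch = tokenizer(["a student answer to be scored"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])
probs = torch.softmax(logits, dim=-1)             # Softmax regression over score classes
```

Treating scoring as Softmax classification over discrete score levels, as the abstract indicates, makes quadratic weighted kappa a natural evaluation metric, since it measures agreement between predicted and human-assigned score categories.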
