Abstract

Sequence labeling underlies many tasks in natural language processing (NLP), playing an important role in word segmentation, named entity recognition (NER), and part-of-speech (POS) tagging. The current mainstream approach to sequence labeling combines neural networks with a conditional random field (CRF); the most common architecture is a bidirectional RNN-CRF model, which addresses the inability of traditional labeling methods to incorporate contextual information. This paper proposes a Chinese sequence labeling model based on a bidirectional GRU-CNN-CRF architecture, which attends more closely to local features and contextual relationships and achieves better performance in word segmentation and NER. The paper uses a corpus provided by Chinese Wikipedia as the training dataset and preprocesses the text with word embeddings. The data are then passed through a three-tier architecture of a bidirectional Gated Recurrent Unit (GRU), a Convolutional Neural Network (CNN), and a CRF to complete the sequence labeling task. Compared with traditional Chinese word segmentation systems, this method is more accurate, and it outperforms the bidirectional GRU-CRF model on NER tasks.
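To make the three-tier architecture concrete, the following is a minimal sketch, not the authors' implementation, of a bidirectional GRU-CNN-CRF tagger in PyTorch. The layer sizes, the use of the third-party pytorch-crf package for the CRF layer, and the BMES segmentation tag set are assumptions made for illustration only.

import torch
import torch.nn as nn
from torchcrf import CRF  # assumption: third-party pytorch-crf package


class BiGRUCNNCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=128, hidden=128, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Bidirectional GRU captures left and right context for each character.
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        # 1-D convolution over the GRU outputs emphasizes local n-gram features.
        self.conv = nn.Conv1d(2 * hidden, hidden, kernel_size=kernel, padding=kernel // 2)
        self.fc = nn.Linear(hidden, num_tags)       # per-position tag emission scores
        self.crf = CRF(num_tags, batch_first=True)  # models tag-transition constraints

    def _emissions(self, token_ids):
        x = self.embed(token_ids)                  # (batch, seq, emb)
        x, _ = self.gru(x)                         # (batch, seq, 2*hidden)
        x = self.conv(x.transpose(1, 2)).relu()    # (batch, hidden, seq)
        return self.fc(x.transpose(1, 2))          # (batch, seq, num_tags)

    def loss(self, token_ids, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self._emissions(token_ids), tags, mask=mask, reduction='mean')

    def predict(self, token_ids, mask):
        # Viterbi decoding of the best tag sequence for each sentence.
        return self.crf.decode(self._emissions(token_ids), mask=mask)


if __name__ == "__main__":
    # Toy usage with random data; 4 tags correspond to a BMES segmentation scheme.
    model = BiGRUCNNCRF(vocab_size=5000, num_tags=4)
    ids = torch.randint(1, 5000, (2, 10))
    tags = torch.randint(0, 4, (2, 10))
    mask = torch.ones(2, 10, dtype=torch.bool)
    print(model.loss(ids, tags, mask).item(), model.predict(ids, mask)[0])

The CNN sits between the GRU and the CRF so that local character n-gram patterns are combined with the sentence-level context encoded by the bidirectional GRU before the CRF scores whole tag sequences.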
