Abstract

Recent advances in natural language representation have made it possible to transfer the internal state of a pretrained upstream model to downstream tasks such as named entity recognition (NER). To make better use of pretrained models for NER, the latest approaches cast NER in the machine reading comprehension (MRC) framework. However, existing MRC approaches do not account for the fact that the performance of reading comprehension models is limited by the absence of contextual information in a single sample. Moreover, existing approaches employ only word-level features in the feature extraction phase. In this paper, a novel MRC model named GFMRC is proposed for NER. GFMRC enhances the MRC model with contextual information and hybrid features. In the preprocessing stage, the samples of the initial MRC dataset are spliced with N-gram information. In the feature extraction stage, global features are extracted for each token using a CNN, and local features are extracted using an LSTM. Experiments are carried out on both Chinese and English datasets, and the results demonstrate the effectiveness of the proposed model. Compared to BERT-MRC+DSC, the improvements on the English CoNLL 2003, English OntoNotes 5.0, Chinese MSRA, and Chinese OntoNotes 4.0 datasets are 0.07%, 0.23%, 0.04%, and 0.26%, respectively.
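
The preprocessing idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how N-gram information is spliced onto a sample, so the layout below (query, then context, then n-grams, separated by BERT-style markers) is an assumption, and the names `MRCSample`, `extract_ngrams`, `build_sample`, and `to_model_input` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MRCSample:
    """One MRC-style NER sample (hypothetical structure, for illustration)."""
    query: str          # natural-language question describing one entity type
    context: List[str]  # tokens of the sentence to be labeled
    ngrams: List[str]   # n-grams spliced onto the sample (assumed form)


def extract_ngrams(tokens: List[str], n: int) -> List[str]:
    """Return all contiguous n-grams of `tokens`, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_sample(query: str, tokens: List[str], n: int = 2) -> MRCSample:
    """Attach the context's own n-grams to an MRC sample."""
    return MRCSample(query=query, context=tokens, ngrams=extract_ngrams(tokens, n))


def to_model_input(sample: MRCSample) -> List[str]:
    """Flatten a sample into one sequence.

    Assumed layout: [CLS] query [SEP] context [SEP] n-grams [SEP].
    """
    return (
        ["[CLS]"] + sample.query.split() + ["[SEP]"]
        + sample.context + ["[SEP]"]
        + sample.ngrams + ["[SEP]"]
    )


if __name__ == "__main__":
    sample = build_sample(
        "Find all person names in the text.",
        ["Barack", "Obama", "visited", "Paris"],
        n=2,
    )
    print(to_model_input(sample))
```

In an MRC formulation, one such sample is built per entity type, and the model predicts answer spans inside the context segment. How the spliced n-grams are then consumed by the CNN and LSTM feature extractors is described in the full paper rather than in this abstract.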
