Abstract

Compared with English, Chinese named entity recognition (NER) achieves lower performance because entity boundaries in Chinese text are more ambiguous, which makes boundary prediction more difficult. Traditional models have tried to sharpen the definition of Chinese entity boundaries by incorporating external features such as lexicons or glyphs, but they have rarely disentangled boundary prediction for separate study. To leverage entity boundary information, this paper decomposes the NER task into two subtasks, boundary annotation and type annotation, and proposes a multi-task learning network (MTL-BERT) built on a bidirectional encoder (BERT) model. The network encodes the subtasks jointly and decodes each one separately, strengthening the model’s feature extraction by reinforcing the feature associations between the subtasks. Experiments on the Weibo NER, MSRA, and OntoNotes 4.0 public datasets show that MTL-BERT reaches F1 scores of 73.8%, 96.5%, and 86.7%, respectively, effectively improving the performance and efficiency of Chinese NER.
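The abstract does not specify the label scheme, but the described decomposition can be illustrated with a minimal sketch: assuming standard BIO-typed tags, each full NER tag is split into a boundary label (for the boundary-annotation subtask) and a type label (for the type-annotation subtask), and per-subtask predictions are recombined into full tags. The function names here are hypothetical, not from the paper.

```python
# Hypothetical sketch of the boundary/type decomposition described in the
# abstract, assuming BIO-typed tags such as "B-PER" (not confirmed by the paper).

def split_tags(tags):
    """Split BIO-typed tags into (boundary, type) label sequences."""
    boundaries, types = [], []
    for tag in tags:
        if tag == "O":
            boundaries.append("O")  # outside any entity in both subtasks
            types.append("O")
        else:
            b, t = tag.split("-", 1)
            boundaries.append(b)    # boundary subtask label: B or I
            types.append(t)         # type subtask label: PER, LOC, ...
    return boundaries, types

def merge_tags(boundaries, types):
    """Recombine per-subtask predictions into full NER tags."""
    return [b if b == "O" else f"{b}-{t}" for b, t in zip(boundaries, types)]

tags = ["B-PER", "I-PER", "O", "B-LOC"]
b, t = split_tags(tags)
print(b, t)                         # ['B', 'I', 'O', 'B'] ['PER', 'PER', 'O', 'LOC']
assert merge_tags(b, t) == tags     # round-trip recovers the original tags
```

In the multi-task setup described, the two label sequences would supervise two decoders that share a single BERT encoder, so that boundary features learned by one subtask inform the other.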
