A Chinese NER Method Based on Chinese Characters' Multiple Information

Daling Li, Haojun Zhang, Jiahui Wang, Shilong Li

doi:10.1109/eebda56825.2023.10090838

Abstract

Chinese named entity recognition (NER) methods based on pre-trained models carry inadequate prior knowledge in their word vectors. To address this, a Chinese NER method that integrates multiple kinds of information is proposed. Building on the existing BERT model, the method enhances the feature representation of Chinese character vectors by constructing a table of bias and word frequency vectors for all Chinese characters in the dataset, embedding these vectors into an improved word fusion model for encoding, and finally feeding the result into a CRF for decoding. Experimental results show that the proposed method improves F1 scores on the public datasets MSRA, Weibo, and Resume while maintaining fast decoding speed and low computational resource consumption, and that it is more robust and generalizes better than models such as BiLSTM-CRF and Lattice-LSTM.
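
The idea of a per-character lookup table can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scalar, log-normalized frequency feature in place of the learned word frequency vectors, uses plain Python lists in place of BERT character vectors, and omits the bias feature, the fusion model, and the CRF entirely. The names `build_frequency_table` and `augment` are hypothetical.

```python
import math
from collections import Counter


def build_frequency_table(sentences):
    """Map each character in the corpus to a log-normalized frequency in (0, 1].

    Assumption: a single scalar per character stands in for the paper's
    word frequency vectors, whose exact form the abstract does not give.
    """
    counts = Counter(ch for sentence in sentences for ch in sentence)
    if not counts:
        return {}
    max_log = math.log1p(max(counts.values()))
    return {ch: math.log1p(n) / max_log for ch, n in counts.items()}


def augment(char_vectors, sentence, freq_table, unknown=0.0):
    """Append the frequency feature to each character's base vector.

    Concatenation is a stand-in for the fusion step; characters missing
    from the table receive the `unknown` value.
    """
    return [
        vec + [freq_table.get(ch, unknown)]
        for ch, vec in zip(sentence, char_vectors)
    ]


# Toy corpus; in practice the table would cover every character in the dataset.
corpus = ["北京大学", "北京市"]
table = build_frequency_table(corpus)

# Placeholder 2-dimensional vectors standing in for BERT character vectors.
base_vectors = [[0.1, 0.2] for _ in "北京"]
fused = augment(base_vectors, "北京", table)
```

Each fused vector is one dimension longer than its base vector, and the most frequent characters in the toy corpus receive a feature value of 1.0. A real system would learn the extra features jointly with the encoder rather than fix them by hand.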

Full Text