Abstract

Named entity recognition (NER) is one of the basic tasks of knowledge extraction. Corpora in some specific fields have sparse semantics and limited text standardization, and existing entity recognition methods applied to them suffer from three problems: interference from wrongly matched potential words, difficulty in matching potential words in those fields, and interference in the attention representation among the key information of entities with different lengths in a sentence. This paper proposes an NER model that integrates word information with an improved entity key information attention mechanism, called the binocular attention-based BiLSTM with CNN network (BACBN). First, character feature embeddings are generated by bidirectional encoder representations from transformers (BERT), and a word-level character feature attention mechanism is proposed to highlight the character features that constitute words. Second, building on a bidirectional long short-term memory network (BiLSTM), an n-gram pooling feature attention mechanism is proposed: a convolutional neural network (CNN) extracts the prominent features corresponding to different convolution kernel sizes, and the context features are weighted according to these prominent features, so as to obtain the information that contributes more to recognizing entities of different lengths. Experimental results on four domain-specific Chinese NER datasets show that the BACBN model improves entity recognition results.
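
The n-gram pooling feature attention described above can be illustrated with a small sketch. This is not the BACBN implementation: the abstract does not give the equations, so the depthwise convolution, the scaled dot-product scoring, the concatenation across kernel sizes, and all function names (`ngram_salient_vector`, `ngram_pooling_attention`) are assumptions made for illustration only. The matrix `H` stands in for BiLSTM context features, and the kernels are random rather than learned.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ngram_salient_vector(H, kernel):
    """Depthwise 1-D convolution of width k over time, then max-pool over time.

    H:      T x d list of lists (stand-in for BiLSTM context features).
    kernel: k x d list of lists (hypothetical filter weights), with k <= T.
    Returns a length-d vector holding the strongest n-gram response per dimension.
    """
    T, d, k = len(H), len(H[0]), len(kernel)
    pooled = [-math.inf] * d
    for t in range(T - k + 1):
        for j in range(d):
            response = sum(kernel[i][j] * H[t + i][j] for i in range(k))
            pooled[j] = max(pooled[j], response)
    return pooled


def ngram_pooling_attention(H, kernels):
    """Weight context features by their similarity to each pooled n-gram feature.

    kernels: dict mapping kernel size k -> k x d filter weights.
    Returns (weighted, weights): weighted is T x (d * len(kernels)), the
    per-kernel weighted features concatenated; weights maps k -> attention
    distribution over the T positions.
    """
    d = len(H[0])
    weights = {}
    for k, kernel in kernels.items():
        salient = ngram_salient_vector(H, kernel)
        # Assumed scoring function: scaled dot product with the pooled feature.
        scores = [sum(h * s for h, s in zip(row, salient)) / math.sqrt(d) for row in H]
        weights[k] = softmax(scores)

    weighted = []
    for t, row in enumerate(H):
        out = []
        for k in sorted(kernels):
            out.extend(weights[k][t] * x for x in row)
        weighted.append(out)
    return weighted, weights


if __name__ == "__main__":
    random.seed(0)
    T, d = 6, 4  # 6 characters, 4-dimensional toy features
    H = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(T)]
    kernels = {k: [[random.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
               for k in (2, 3, 4)}
    weighted, weights = ngram_pooling_attention(H, kernels)
    for k in sorted(weights):
        print(k, [round(w, 3) for w in weights[k]])
```

Each kernel size yields its own attention distribution over character positions, which is one plausible way to let features for short and long entities be weighted separately instead of competing inside a single attention map.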
