Abstract

In the medical field, named entity recognition (NER) plays a crucial role in extracting information from electronic medical records and medical texts. To address long-distance entity dependencies, entity confusion, and difficult boundary detection in the Chinese electronic medical record NER task, we propose a Chinese electronic medical record NER method based on a multi-head attention mechanism and character-word fusion. The method uses a new joint character-word feature representation built on the pre-trained model BERT and a self-constructed domain dictionary, which delineates entity boundaries accurately and mitigates the impact of unregistered (out-of-vocabulary) words. On top of the BiLSTM-CRF model, a multi-head attention mechanism is then introduced to learn the dependencies between distant entities and the entity information in different semantic spaces, which effectively improves the performance of the model. Experiments show that our model achieves a significant improvement over the baselines: its F1 value on the Chinese electronic medical record data set reaches 95.22%, which is 2.67% higher than the F1 value of the baseline model.
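The abstract does not spell out how the character-word features are built, so the following is only a minimal sketch of one common way to derive word-level information for each character from a domain dictionary. It assumes a B/M/E/S (begin, middle, end, single) matching scheme; the function name `lexicon_tags`, the `max_len` parameter, and the toy dictionary are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set


def lexicon_tags(sentence: str, lexicon: Set[str],
                 max_len: int = 6) -> List[Dict[str, List[str]]]:
    """For each character, collect the dictionary words that cover it,
    grouped by the character's position inside the word:
    B (begin), M (middle), E (end), S (single-character word)."""
    tags = [{"B": [], "M": [], "E": [], "S": []} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every substring of up to max_len characters starting here.
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                tags[start]["S"].append(word)
                continue
            tags[start]["B"].append(word)
            tags[end - 1]["E"].append(word)
            for mid in range(start + 1, end - 1):
                tags[mid]["M"].append(word)
    return tags


# Hypothetical toy dictionary and sentence, for illustration only.
lexicon = {"患者", "糖尿病", "病史"}
sentence = "患者有糖尿病史"
for char, tag in zip(sentence, lexicon_tags(sentence, lexicon)):
    print(char, tag)
```

In a full model, the word sets attached to each character would typically be embedded and combined with that character's BERT vector before the BiLSTM-CRF layers; that step is omitted here because the abstract does not describe it.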

