Abstract

Named entity recognition is a fundamental and crucial task in medical natural language processing. In the medical domain, Chinese clinical named entity recognition identifies the boundaries and types of medical entities in unstructured text such as electronic medical records. Recently, a model that combines bidirectional Long Short-term Memory networks (BiLSTMs) with a conditional random field (BiLSTM-CRF), based on character-level semantics, has achieved great success in Chinese clinical named entity recognition tasks. However, this method can only capture contextual semantics between characters in a sentence. Chinese characters are hieroglyphic, and deeper semantic information is hidden inside them, which the BiLSTM-CRF model fails to exploit. In addition, some entities in a sentence depend on one another, but the Long Short-term Memory (LSTM) network does not perfectly capture long-term dependencies between characters. We therefore propose a BiLSTM-CRF model based on radical-level features and a self-attention mechanism to address these problems. We use a convolutional neural network (CNN) to extract radical-level features, aiming to capture the intrinsic and internal relevance of characters. In addition, we use a self-attention mechanism to capture dependencies between characters regardless of their distance. Experiments show that our model achieves F1-scores of 93.00% and 86.34% on the CCKS-2017 and TP_CNER datasets, respectively.
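
To make the described architecture concrete, the following is a minimal sketch in PyTorch (an assumed framework) of an encoder that concatenates character embeddings with CNN-extracted radical-level features, runs a BiLSTM, applies self-attention, and projects to per-character tag scores. All dimensions, names, and the omitted CRF decoding layer are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class RadicalSelfAttnBiLSTM(nn.Module):
    def __init__(self, n_chars, n_radicals, n_tags,
                 char_dim=100, rad_dim=50, hidden=128, heads=4):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.rad_emb = nn.Embedding(n_radicals, rad_dim)
        # CNN over each character's radical sequence captures
        # intra-character (radical-level) structure.
        self.rad_cnn = nn.Conv1d(rad_dim, rad_dim, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(char_dim + rad_dim, hidden,
                              batch_first=True, bidirectional=True)
        # Self-attention lets every position attend to every other position,
        # modeling dependencies regardless of distance.
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        self.to_tags = nn.Linear(2 * hidden, n_tags)
        # A CRF layer (e.g., torchcrf.CRF) would normally decode these scores.

    def forward(self, chars, radicals):
        # chars: (batch, seq_len); radicals: (batch, seq_len, rads_per_char)
        b, t, r = radicals.shape
        rad = self.rad_emb(radicals).view(b * t, r, -1).transpose(1, 2)
        rad = torch.relu(self.rad_cnn(rad)).max(dim=2).values.view(b, t, -1)
        x = torch.cat([self.char_emb(chars), rad], dim=-1)
        h, _ = self.bilstm(x)
        h, _ = self.attn(h, h, h)
        return self.to_tags(h)  # per-character tag emission scores

# Toy usage: batch of 2 sentences, 6 characters, 4 radicals per character.
model = RadicalSelfAttnBiLSTM(n_chars=5000, n_radicals=300, n_tags=9)
scores = model(torch.randint(0, 5000, (2, 6)), torch.randint(0, 300, (2, 6, 4)))
print(scores.shape)  # torch.Size([2, 6, 9])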
