Improving Chinese Clinical Named Entity Recognition Based on BiLSTM-CRF by Cross-Domain Transfer

Kunli Zhang, Lei Zhuang, Donghui Yue

doi:10.1145/3409501.3409527

Abstract

Named entity recognition (NER) is an essential component of natural language processing (NLP) applications. Most existing NER models focus on social media, biomedicine, and finance. Research on Chinese electronic medical records (EMRs), however, remains limited, and medical entities in these records are not yet modeled effectively. In this paper, a novel cross-domain transfer model, T-BiLSTM-CRF, is proposed for Chinese clinical named entity recognition (e.g., disease, symptom, drug, anatomy). Given the scarcity of large-scale datasets, we build a hybrid model that extracts discriminative information and integrates it with the target features. The proposed method encodes features from different sources to better represent the various entity types and to improve recognition performance. Extensive experiments were carried out on the CCKS 2018 benchmark dataset. The results demonstrate the superiority of the proposed T-BiLSTM-CRF over several representative methods.

Full Text