Abstract

Clinic Named Entity Recognition (CNER) aims to recognize named entities such as body parts, diseases and symptoms in Electronic Health Records (EHRs), which can benefit many intelligent biomedical systems. In recent years, increasing attention has been paid to end-to-end CNER with recurrent neural networks (RNNs), especially long short-term memory networks (LSTMs). However, capturing long-range dependencies remains a great challenge for RNNs. Moreover, Chinese presents additional challenges: it uses logograms instead of an alphabet, its words are often ambiguous, and it has no explicit word boundaries. In this work, we present a BiLSTM-CRF model with a self-attention mechanism (Att-BiLSTM-CRF) for the Chinese CNER task, which aims to address these problems. The self-attention mechanism can learn long-range dependencies by establishing a direct connection between every pair of characters. In order to learn more semantic information about Chinese characters, we propose a novel fine-grained character-level representation method. We also introduce part-of-speech (POS) labeling information into our model to capture the semantic information in the input sentence. We evaluate the model on the CCKS-2017 Shared Task 2 dataset, and the experimental results indicate that our model outperforms other state-of-the-art methods.
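
To make the "direct connection between characters" idea concrete, the following is a minimal sketch of self-attention over a sequence of per-character hidden states. It assumes plain scaled dot-product attention with no learned projection matrices; the abstract does not give the exact formulation used in Att-BiLSTM-CRF, so the function names (`softmax`, `self_attention`) and the toy `hidden` vectors are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(states):
    """Scaled dot-product self-attention (assumed variant, no learned weights).

    states: list of equal-length vectors, one per character, e.g. BiLSTM outputs.
    Returns (outputs, weights), where weights[i][j] is how strongly
    position i attends to position j, regardless of their distance.
    """
    dim = len(states[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in states:
        # Score the current position against every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in states]
        row = softmax(scores)
        weights.append(row)
        # The output is the attention-weighted average of all states.
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, states)) for d in range(dim)]
        )
    return outputs, weights


# Toy hidden states for a four-character sentence (made-up numbers).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
context, attn = self_attention(hidden)
```

In this toy input the first and last characters have identical states, so the first position attends to the last one exactly as strongly as to itself, even though they are three steps apart. This is the property that lets attention bypass the step-by-step propagation an RNN relies on.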

Highlights

  • Clinic Named Entity Recognition (CNER) is a basic and important natural language processing (NLP) task in clinical and digital health research

  • In order to learn more semantic information of Chinese characters, we propose a novel fine-grained character-level representation method

  • The contributions of this paper can be summarized as follows: 1) We propose an attention-based BiLSTM-CRF (Att-BiLSTM-CRF) model to perform the Chinese CNER task


Summary

INTRODUCTION

Clinic Named Entity Recognition (CNER) is a basic and important natural language processing (NLP) task in clinical and digital health research. For the Chinese CNER task, Hu et al. [24] used an ensemble of rule-based, CRF and LSTM-based models, and enlarged the training data with a self-training algorithm to improve performance. Xia and Wang [31] proposed a BiLSTM-CRF model with self-training and ensemble learning to address the same task. We utilize a novel fine-grained character-level representation method to obtain more semantic information about Chinese characters, and we introduce POS labeling information into our model to learn the semantic information of input sentences. The structure of our model was described
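
One simple way to feed POS information into a character-level model is sketched below. It is an assumption for illustration only: each character inherits the POS tag of the word that contains it, and a one-hot POS vector is concatenated to the character vector. The summary does not specify how the paper combines these features or what its fine-grained character representation looks like, so the function `char_pos_features`, the toy vectors, and the tag set are all hypothetical.

```python
def char_pos_features(words, pos_tags, char_vectors, pos_index):
    """Build one feature vector per character: character vector + one-hot POS.

    words:        segmented words of a sentence
    pos_tags:     one POS tag per word
    char_vectors: dict mapping a character to its embedding (list of floats)
    pos_index:    dict mapping a POS tag to its one-hot position
    """
    features = []
    for word, tag in zip(words, pos_tags):
        one_hot = [0.0] * len(pos_index)
        one_hot[pos_index[tag]] = 1.0
        for ch in word:
            # Every character in the word shares the word-level POS tag.
            features.append(char_vectors[ch] + one_hot)
    return features


# Toy example (made-up embeddings and tags): "腹部" (abdomen), "疼痛" (pain).
words = ["腹部", "疼痛"]
pos_tags = ["n", "v"]
char_vectors = {
    "腹": [0.1, 0.2],
    "部": [0.3, 0.4],
    "疼": [0.5, 0.6],
    "痛": [0.7, 0.8],
}
pos_index = {"n": 0, "v": 1}
feats = char_pos_features(words, pos_tags, char_vectors, pos_index)
```

Here `feats` holds four vectors, one per character, each of length four: two embedding dimensions followed by two POS dimensions. In a full model these vectors would be the input to the BiLSTM layer.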

EMBEDDING LAYER
BILSTM LAYER
SELF-ATTENTION LAYER
CRF LAYER
Findings
CONCLUSION