Abstract

Background: Clinical entity recognition, a fundamental task of clinical text processing, has attracted a great deal of attention during the last decade. However, most studies focus on clinical text in English rather than other languages. Recently, a few researchers have begun to study entity recognition in Chinese clinical text.

Methods: In this paper, a novel deep neural network, called attention-based CNN-LSTM-CRF, is proposed to recognize entities in Chinese clinical text. Attention-based CNN-LSTM-CRF extends LSTM-CRF by introducing a CNN (convolutional neural network) layer after the input layer to capture local context information of words of interest, and an attention layer before the CRF layer to select relevant words in the same sentence.

Results: To evaluate the proposed method, we compare it with two other currently popular methods, CRF (conditional random field) and LSTM-CRF, on two benchmark datasets. One of the datasets is publicly available and contains only contiguous clinical entities; the other was constructed by us and contains both contiguous and discontiguous clinical entities. Experimental results show that attention-based CNN-LSTM-CRF outperforms CRF and LSTM-CRF.

Conclusions: CNN and the attention mechanism are individually beneficial to an LSTM-CRF-based Chinese clinical entity recognition system, regardless of whether contiguous clinical entities are considered. The contribution of the attention mechanism is greater than that of CNN.

Highlights

  • With the rapid development of electronic medical information systems, more and more electronic medical records (EMRs) are available for medical research and application

  • We propose a novel deep neural network, called attention-based CNN-LSTM-CRF (CNN: convolutional neural network; CRF: conditional random field), for entity recognition that considers both contiguous and discontiguous entities in Chinese clinical text

  • It consists of the following five layers: 1) Input layer, which takes the representation of each Chinese character in a sentence; 2) CNN layer, which represents the local context of a Chinese character of interest within a sliding window (e.g. [−1, 1] in Fig. 1); 3) LSTM layer, which uses a forward LSTM and a backward LSTM to model a sentence and capture its global context information; 4) Attention layer, which determines how strongly the other Chinese characters in the sentence relate to a Chinese character of interest; 5) CRF layer, which predicts a label sequence for an input sentence by considering relations between neighboring labels
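
Three of the layer ideas above (the sliding window of layer 2, the attention weights of layer 4, and the label decoding of layer 5) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the dot-product scoring used for attention, the toy scores, and the O/B/I label indices are all assumptions made for illustration, and the learned CNN and LSTM parameters are omitted entirely.

```python
import math


def window_contexts(chars, left=1, right=1, pad="<PAD>"):
    """Layer 2 idea: gather the [-left, right] neighbours of each character.
    A real CNN layer would convolve over the embeddings of these windows."""
    padded = [pad] * left + list(chars) + [pad] * right
    return [padded[i:i + left + right + 1] for i in range(len(chars))]


def attention_weights(query, keys):
    """Layer 4 idea: softmax over similarity scores between one character's
    hidden vector (query) and every character's hidden vector (keys).
    Dot-product scoring is an assumption, not taken from the paper."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def viterbi(emissions, transitions):
    """Layer 5 idea: find the best label path given per-position label scores
    emissions[t][j] and neighbouring-label scores transitions[i][j]."""
    n_labels = len(emissions[0])
    best = list(emissions[0])
    back = []
    for scores in emissions[1:]:
        step, pointers = [], []
        for j in range(n_labels):
            prev = max(range(n_labels), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            step.append(best[prev] + transitions[prev][j] + scores[j])
        best = step
        back.append(pointers)
    path = [max(range(n_labels), key=lambda j: best[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical label indices: 0 = O, 1 = B, 2 = I. The O -> I transition is
# penalised heavily so that an entity cannot start with an I label.
TRANSITIONS = [[0.0, 0.0, -100.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]

print(window_contexts("头痛"))
print(attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
print(viterbi([[0.1, 2.0, 0.0], [0.0, 0.1, 1.5], [2.0, 0.0, 0.3]], TRANSITIONS))
# the last call prints [1, 2, 0], i.e. B, I, O
```

The transition table is what lets the decoding step account for relations between neighboring labels: with emissions `[[2.0, 0.0, 0.0], [0.0, 0.0, 1.0]]`, choosing the best label at each position independently would give O followed by I, whereas `viterbi` returns O, O because the O -> I transition is penalised.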


Summary

Introduction

With the rapid development of electronic medical information systems, more and more electronic medical records (EMRs) are available for medical research and application. In EMRs, plenty of useful information is embedded in clinical text. A large number of methods have been proposed for clinical entity recognition. To promote the development of entity recognition in Chinese clinical text, the organizers of the China Conference on Knowledge Graph and Semantic Computing (CCKS) launched a challenge in 2017 [3]. The challenge organizers provided a dataset (called CCKS2017_CNER) with only contiguous clinical entities, following the guideline of i2b2. Most studies focus on clinical text in English rather than other languages. A few researchers have begun to study entity recognition in Chinese clinical text.

