Entity recognition in Chinese clinical text using attention-based CNN-LSTM-CRF

Buzhou Tang, Xiaolong Wang, Qingcai Chen, Jun Yan

doi:10.1186/s12911-019-0787-y


Open Access

https://doi.org/10.1186/s12911-019-0787-y


Abstract

Background: Clinical entity recognition, a fundamental task of clinical text processing, has attracted a great deal of attention during the last decade. However, most studies focus on clinical text in English rather than other languages. Recently, a few researchers have begun to study entity recognition in Chinese clinical text.

Methods: In this paper, a novel deep neural network, called attention-based CNN-LSTM-CRF, is proposed to recognize entities in Chinese clinical text. Attention-based CNN-LSTM-CRF extends LSTM-CRF by introducing a CNN (convolutional neural network) layer after the input layer to capture local context information of words of interest, and an attention layer before the CRF layer to select relevant words in the same sentence.

Results: To evaluate the proposed method, we compare it with two other currently popular methods, CRF (conditional random field) and LSTM-CRF, on two benchmark datasets. One of the datasets is publicly available and contains only contiguous clinical entities; the other was constructed by us and contains both contiguous and discontiguous clinical entities. Experimental results show that attention-based CNN-LSTM-CRF outperforms CRF and LSTM-CRF.

Conclusions: CNN and the attention mechanism are each beneficial to an LSTM-CRF-based Chinese clinical entity recognition system, whether or not the entities considered are restricted to contiguous ones. The contribution of the attention mechanism is greater than that of the CNN.
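The two additions the abstract describes, a local context window and an attention step over the sentence, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the padding token, and the dot-product scoring are assumptions made for illustration, and the learned convolution filters and LSTM states are replaced by plain lists of numbers.

```python
import math

def context_windows(chars, left=1, right=1, pad="<PAD>"):
    """For each character, collect the characters in the window [-left, right].

    A CNN layer would apply learned filters to the embeddings of each window;
    here we only show which characters each window covers.
    """
    padded = [pad] * left + list(chars) + [pad] * right
    return [padded[i:i + left + right + 1] for i in range(len(chars))]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden, i):
    """Weight every position by its similarity to position i.

    Dot-product scoring is an assumption of this sketch; the source does not
    state which scoring function the paper uses.
    """
    query = hidden[i]
    scores = [sum(q * k for q, k in zip(query, h)) for h in hidden]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all hidden vectors.
    context = [sum(w * h[d] for w, h in zip(weights, hidden))
               for d in range(len(query))]
    return weights, context
```

Calling `context_windows("头痛三天")` yields one three-character window per input character, padded at the sentence edges, and `attend(hidden, i)` returns weights that sum to 1 together with the resulting context vector for position `i`.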

Highlights

With the rapid development of electronic medical information systems, more and more electronic medical records (EMRs) are available for medical research and application.
We propose a novel deep neural network, called attention-based convolutional neural network (CNN)-LSTM-conditional random field (CRF), for entity recognition in Chinese clinical text that considers both contiguous and discontiguous entities.
It consists of the following five layers: 1) Input layer, which takes the representation of each Chinese character in a sentence; 2) CNN layer, which represents the local context of a Chinese character of interest within a sliding window (e.g. [−1, 1] in Fig. 1); 3) LSTM layer, which uses a forward LSTM and a backward LSTM to model a sentence and capture its global context information; 4) Attention layer, which determines how strongly the other Chinese characters in the sentence relate to a Chinese character of interest; 5) CRF layer, which predicts a label sequence for an input sentence by considering relations between neighboring labels.
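To make the role of the CRF layer concrete, the following sketch decodes the best label sequence with the Viterbi algorithm, combining per-character scores with scores for transitions between neighboring labels. The `viterbi` function, the BIO-style label names, and all numbers are assumptions for illustration; the source does not give the paper's tagging scheme or its learned parameters.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][y]   : score of label y at position t (e.g. from the layers below)
    transitions[p][y] : score of moving from label p to label y
    """
    n = len(labels)
    best = list(emissions[0])          # best score of any path ending in each label
    backptrs = []
    for t in range(1, len(emissions)):
        new_best, ptrs = [], []
        for y in range(n):
            # Pick the previous label that maximizes the path score into y.
            prev = max(range(n), key=lambda p: best[p] + transitions[p][y])
            new_best.append(best[prev] + transitions[prev][y] + emissions[t][y])
            ptrs.append(prev)
        best = new_best
        backptrs.append(ptrs)
    # Trace back from the best final label.
    y = max(range(n), key=lambda k: best[k])
    path = [y]
    for ptrs in reversed(backptrs):
        y = ptrs[y]
        path.append(y)
    path.reverse()
    return [labels[k] for k in path]

# Hypothetical toy setup: "I" may not follow "O".
LABELS = ["O", "B", "I"]
TRANSITIONS = [
    [0.0, 0.0, -1e4],  # from O
    [0.0, 0.0, 0.0],   # from B
    [0.0, 0.0, 0.0],   # from I
]
```

With emissions `[[1.0, 0.8, 0.0], [0.0, 0.0, 1.0]]`, choosing the best label per character independently would give `O, I`, which the transition scores penalize heavily; `viterbi` instead returns `["B", "I"]`, showing how relations between neighboring labels change the prediction.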

Summary

Introduction

With the rapid development of electronic medical information systems, more and more electronic medical records (EMRs) are available for medical research and application. In EMRs, plenty of useful information is embedded in clinical text. A large number of methods have been proposed for clinical entity recognition. To promote the development of entity recognition in Chinese clinical text, the organizers of the China conference on knowledge graph and semantic computing (CCKS) launched a challenge in 2017 [3]. The challenge organizers provided a dataset (called CCKS2017_CNER) with only contiguous clinical entities, following the guideline of i2b2. Most studies focus on clinical text in English rather than other languages. A few researchers have begun to study entity recognition in Chinese clinical text.

Journal: BMC Medical Informatics and Decision Making
Publication Date: Apr 1, 2019
Citations: 42
License type: open-access

Similar Papers

A machine learning based approach to identify protected health information in Chinese clinical text
Liting Du ... Jingdong Ma
International Journal of Medical Informatics | VOL. 116
22 May 2018

Extracting entities with attributes in clinical text via joint deep learning.
Xue Shi ... Xiaolong Wang
Journal of the American Medical Informatics Association | VOL. 26
24 Sep 2019

PASCAL: a pseudo cascade learning framework for breast cancer treatment entity normalization in Chinese clinical text
Yang An ... Xiaopeng Wei
BMC Medical Informatics and Decision Making | VOL. 20
28 Aug 2020

Phenonizer: A Fine-Grained Phenotypic Named Entity Recognizer for Chinese Clinical Texts.
Qunsheng Zou ...
BioMed research international | VOL. 2022
23 Mar 2022
