Abstract

Clinical Named Entity Recognition (CNER) focuses on locating named entities in electronic medical records (EMRs), and its results play an important role in the development of intelligent biomedical systems. Beyond research on alphabetic languages, the study of non-alphabetic languages has also attracted considerable attention. In this paper, a neural model is proposed to extract entities from EMRs written in Chinese. To avoid the noise introduced by Chinese word segmentation errors, we use character embeddings as the only feature, without extra resources. In our model, concatenated n-gram character embeddings represent the context semantics. A self-attention mechanism is then applied to model long-range dependencies among the embeddings. The concatenation of the new representations produced by the attention module is fed to a bidirectional long short-term memory (BiLSTM) network, followed by a conditional random field (CRF) layer that extracts the entities. We evaluate the method on the CCKS-2017 Shared Task 2 dataset, and the experimental results show that our model outperforms the other approaches it is compared with.
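To make the idea of per-character n-gram features concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: the function names `char_ngrams` and `ngram_features`, the right-padding scheme, and the `<pad>` symbol are all assumptions. For each character it collects the 1-gram, 2-gram, ... up to `max_n`-gram that starts at that character; a model of the kind described above would then look up an embedding for each n-gram and concatenate them.

```python
from typing import List

PAD = "<pad>"  # hypothetical padding symbol, not taken from the paper


def char_ngrams(text: str, n: int) -> List[str]:
    """Return one n-gram per character: the n characters starting at
    that position, padded on the right when the text runs out."""
    chars = list(text)
    padded = chars + [PAD] * (n - 1)
    return ["".join(padded[i:i + n]) for i in range(len(chars))]


def ngram_features(text: str, max_n: int = 3) -> List[List[str]]:
    """For each character, collect its 1..max_n-grams in order.
    A neural model would embed each n-gram and concatenate the vectors."""
    per_n = [char_ngrams(text, n) for n in range(1, max_n + 1)]
    return [list(grams) for grams in zip(*per_n)]


if __name__ == "__main__":
    # "患者头痛" (the patient has a headache); no word segmentation is needed.
    for feats in ngram_features("患者头痛", max_n=2):
        print(feats)
```

Because the features are built directly from characters, no segmenter is involved, which is the property the abstract relies on to avoid segmentation noise.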

Highlights

  • With the rapid development of information technology, medical institutions have widely used electronic medical records (EMRs) to facilitate data collection which includes patient health information, diagnostic tests, procedures performed and clinical decision making

  • There are three main subtasks in Biomedical Information Extraction (BioIE): (1) Named Entity Recognition (NER), which aims to categorize entity names in the clinical and biomedical domains, (2) Relation Extraction (RE), which targets the detection of semantic relations between entities, and (3) Event Extraction (EE), which produces a more detailed, formal representation of the knowledge contained in the targeted documents [2]

  • We propose an Att-BiLSTM-CRF model (self-attention, bidirectional long short-term memory, conditional random field) to perform the Chinese Clinical Named Entity Recognition (CNER) task based on combinations of n-gram character embeddings of different lengths, without using external knowledge
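The attention step named in the last highlight can be illustrated with a generic scaled dot-product self-attention over a sequence of character vectors. This is a simplified sketch under stated assumptions, not the paper's exact module: it uses the input vectors directly as queries, keys and values (no learned projection matrices), a single head, and plain Python lists instead of a tensor library. The names `softmax` and `self_attention` are illustrative.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity projections
    (an assumption made for brevity): each output vector is a weighted
    average of all input vectors, so every position can draw on every
    other position regardless of distance."""
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, x)) for i in range(d)])
    return out


if __name__ == "__main__":
    # Three toy 2-dimensional "character embeddings".
    for row in self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]):
        print(row)
```

In an architecture like the one described, the attended vectors would then be passed to the BiLSTM and CRF layers; those are omitted here.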

Summary

A Neural N-Gram-Based Classifier for Chinese Clinical Named Entity Recognition

Citation: Lin, C.-S.; Jwo, J.-S.; Lee, C.-H. A Neural N-Gram-Based Classifier for Chinese Clinical Named Entity Recognition. Appl. Sci. 2021, 11, 8682. https://doi.org/10.3390/app11188682

Academic Editors: Julian Szymanski, Andrzej Sobecki, Higinio Mora and Doina Logofătu

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction
Related Work
The Proposed Approach
Neural Entity Recognition Model
Experiment and Results
Conclusions