Abstract
Clinical Named Entity Recognition (CNER) locates named entities in electronic medical records (EMRs), and its results play an important role in the development of intelligent biomedical systems. Beyond research on alphabetic languages, non-alphabetic languages have also attracted considerable attention. In this paper, we propose a neural model for extracting entities from EMRs written in Chinese. To avoid the noise introduced by Chinese word segmentation errors, we use character embeddings as the only feature, without extra resources. In our model, concatenated n-gram character embeddings represent the contextual semantics. A self-attention mechanism is then applied to model long-range dependencies among the embeddings. The concatenation of the new representations produced by the attention module is fed to a bidirectional long short-term memory (BiLSTM) network, followed by a conditional random field (CRF) layer that extracts the entities. We evaluate the method on the CCKS-2017 Shared Task 2 dataset, and the experimental results show that our model outperforms other approaches.
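To make the self-attention step more concrete, the following is a minimal sketch of scaled dot-product self-attention over a sequence of embedding vectors. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the attention variant, so the sketch uses the input vectors directly as queries, keys and values (no learned projections), and the function names `softmax` and `self_attention` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention.

    Assumption: queries, keys and values are the input vectors themselves;
    a trained model would normally apply learned projections first.
    """
    d = len(vectors[0])
    outputs = []
    for q in vectors:
        # Similarity of this position to every position in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        # Each output is a weighted average of all input vectors, so distant
        # positions can contribute directly regardless of their distance.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)]
        )
    return outputs
```

Because every output position is a weighted average over the whole sequence, a character at the start of a sentence can influence the representation of one at the end in a single step, which is the long-range property the abstract refers to.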
Highlights
With the rapid development of information technology, medical institutions have widely adopted electronic medical records (EMRs) to facilitate data collection, which covers patient health information, diagnostic tests, procedures performed and clinical decision making.
There are three main subtasks in Biomedical Information Extraction (BioIE): (1) Named Entity Recognition (NER), which aims to categorize entity names in the clinical and biomedical domains; (2) Relation Extraction (RE), which targets the detection of semantic relations between entities; and (3) Event Extraction (EE), which explores a more detailed alternative that produces a formal representation of the knowledge within the targeted documents [2].
We propose an Att-BiLSTM-CRF model, combining attention, bidirectional long short-term memory (BiLSTM) and a conditional random field (CRF), to perform the Chinese Clinical Named Entity Recognition (CNER) task based on combinations of n-gram character embeddings of different lengths, without using external knowledge.
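The idea of combining character n-grams of different lengths can be sketched as follows. This is only an illustration under assumptions: the summary does not say how the n-grams are windowed or padded, so the sketch takes the n-gram starting at each character and pads on the right with a hypothetical `<pad>` symbol. The names `char_ngrams` and `combined_features` are also hypothetical.

```python
PAD = "<pad>"  # hypothetical padding symbol, not taken from the paper


def char_ngrams(sentence, n):
    """Return, for each character position, the n-gram starting there.

    Assumption: positions near the end are right-padded so that every
    character gets exactly one n-gram of length n.
    """
    chars = list(sentence) + [PAD] * (n - 1)
    return [tuple(chars[i:i + n]) for i in range(len(sentence))]


def combined_features(sentence, lengths=(1, 2, 3)):
    """Group the n-grams of every requested length by character position.

    In a neural model each n-gram would be looked up in an embedding table
    and the resulting vectors concatenated; here the n-grams are kept as-is.
    """
    per_length = [char_ngrams(sentence, n) for n in lengths]
    return [tuple(grams[i] for grams in per_length) for i in range(len(sentence))]


# Example with the two-character word for "headache".
features = combined_features("头痛", lengths=(1, 2))
```

No word segmentation is involved at any point: the features are derived from raw characters only, which matches the stated goal of avoiding segmentation errors and external resources.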
Summary
Citation: Lin, C.-S.; Jwo, J.-S.; Lee, C.-H. A Neural N-Gram-Based Classifier for Chinese Clinical Named Entity Recognition. Appl. Sci. 2021, 11, 8682. https://doi.org/10.3390/app11188682
Academic Editors: Julian Szymanski, Andrzej Sobecki, Higinio Mora and Doina Logofătu
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.