Semi-Supervised Noisy Label Learning for Chinese Clinical Named Entity Recognition

Zhucong Li, Yubo Chen, Kang Liu, Jing Wan, Jun Zhao, Baoli Zhang, Zhen Gan, Shengping Liu

doi:10.1162/dint_a_00099

Abstract

This paper describes our approach to the Chinese clinical named entity recognition (CNER) task organized by the 2020 China Conference on Knowledge Graph and Semantic Computing (CCKS) competition. The task requires identifying the boundaries and category labels of six types of entities in Chinese electronic medical records (EMRs). We constructed a hybrid system composed of a semi-supervised noisy label learning model based on adversarial training and a rule-based post-processing module. The core idea of the hybrid system is to reduce the impact of data noise by optimizing the model results. In addition, we used post-processing rules to correct three kinds of errors in the model predictions: redundant labeling, missing labeling, and wrong labeling. Our method achieved 0.9156 under the strict criteria and 0.9660 under the relaxed criteria on the final test set, ranking first.

Highlights

This task is a continuation of the series of evaluations carried out by the China Conference on Knowledge Graph and Semantic Computing (CCKS) around the semantics of Chinese electronic medical records.
This paper describes our approach to the Chinese clinical named entity recognition (CNER) task organized by the 2020 China Conference on Knowledge Graph and Semantic Computing (CCKS) competition.
We constructed a hybrid system composed of a semi-supervised noisy label learning model based on adversarial training and a rule-based post-processing module.

Summary

Evaluation Task

This task is a continuation of the series of evaluations carried out by the China Conference on Knowledge Graph and Semantic Computing (CCKS) around the semantics of Chinese electronic medical records. It extends the related evaluation tasks of CCKS 2017, CCKS 2018, and CCKS 2019. Given a set of plain-text electronic medical record (EMR) documents, the 2020 Chinese medical record entity recognition task is to extract entity mentions and classify them into six predefined entity types: disease and diagnosis, imaging examination, laboratory examination, operation, drug, and anatomy.
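The extraction target described above can be pictured as a set of typed character spans. The following is a minimal Python sketch, not the organizers' scoring script: the `Mention` class, the offset convention, and the function names are hypothetical. The abstract reports results under strict and relaxed criteria without defining them on this page, so the sketch assumes a common convention: a strict match requires identical boundaries and type, a relaxed match requires the same type and overlapping boundaries, and both are summarized as an F1-style score.

```python
from dataclasses import dataclass

# The six entity types named in the task description.
ENTITY_TYPES = {
    "disease and diagnosis",
    "imaging examination",
    "laboratory examination",
    "operation",
    "drug",
    "anatomy",
}


@dataclass(frozen=True)
class Mention:
    """Hypothetical representation of one entity mention."""
    start: int   # inclusive character offset (assumed convention)
    end: int     # exclusive character offset (assumed convention)
    label: str   # one of ENTITY_TYPES

    def __post_init__(self):
        if self.label not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.label}")
        if not 0 <= self.start < self.end:
            raise ValueError("offsets must satisfy 0 <= start < end")


def strict_match(pred: Mention, gold: Mention) -> bool:
    # Assumed strict criterion: same boundaries and same type.
    return pred == gold


def relaxed_match(pred: Mention, gold: Mention) -> bool:
    # Assumed relaxed criterion: same type and overlapping spans.
    return (pred.label == gold.label
            and pred.start < gold.end
            and gold.start < pred.end)


def f1_score(preds, golds, match) -> float:
    """F1 over mentions under the given matching function."""
    matched_preds = sum(any(match(p, g) for g in golds) for p in preds)
    matched_golds = sum(any(match(p, g) for p in preds) for g in golds)
    precision = matched_preds / len(preds) if preds else 0.0
    recall = matched_golds / len(golds) if golds else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [Mention(0, 3, "drug"), Mention(5, 9, "anatomy")]
pred = [Mention(0, 3, "drug"), Mention(6, 9, "anatomy")]
strict_f1 = f1_score(pred, gold, strict_match)
relaxed_f1 = f1_score(pred, gold, relaxed_match)
```

In this toy example the second prediction misses the gold boundary by one character, so it counts only under the relaxed criterion. That gap mirrors why a relaxed score is never lower than the strict score for the same predictions.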

Data Set

Overview of Main Challenges and Solutions

Adversarial Training

Post-processing Rules

Sentence Segmentation

Text Normalization

Results

Conclusion and Future Work

Data Availability Statement

Journal: Data Intelligence	Publication Date: Sep 8, 2021
Citations: 7	License type: CC BY 4.0
