Information Extraction for Intestinal Cancer Electronic Medical Records

Sufen Wang, Junyi Yuan, Changqing Pan, Bo Xu, Hong Zhang, Ming Du, Minmin Pang

doi:10.1109/access.2020.3005684

Open Access

https://doi.org/10.1109/access.2020.3005684

Abstract

Structured electronic medical records (EMRs) produce data that is well suited to mining and extraction, which makes them an effective way to use valuable data resources. However, hospitals have accumulated a large amount of unstructured data in EMRs that cannot be searched effectively, resulting in a serious waste of resources. In this paper, we study the problem of extracting attribute values from the unstructured text in EMRs. Based on our observation of intestinal cancer diagnostic texts, we divide attributes into two categories, discriminative attributes and extractive attributes, and tackle their value extraction with text classification and sequence labeling, respectively. For discriminative attributes, we first divide the text into sentences/segments that serve as instances. Second, we fine-tune pre-trained word embeddings to capture domain-specific semantics/knowledge. Third, we use an attention mechanism to select the most important instance for each attribute extractor. Finally, we use multi-task learning to share useful information and obtain better experimental results. For extractive attributes, we propose a novel model consisting of a BiLSTM layer, a CNN layer and a CRF layer: BiLSTM and CNN learn text features, and CRF serves as the last layer of the model. Experiments show that our method outperforms several competitive baseline methods.
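
As a rough illustration of the discriminative-attribute steps described in the abstract (splitting a text into instances, then letting attention pick the most relevant instance for one attribute), the following minimal sketch uses plain dot-product attention. The delimiter set, the two-dimensional instance vectors and the query vector are all hypothetical placeholders; the paper's actual encoder, attention form and embeddings are not given in this excerpt.

```python
import math
import re


def split_instances(text):
    """Split a diagnostic text into sentence/segment instances.

    The delimiter set here is an assumption for illustration only.
    """
    parts = re.split(r"[.;]\s*", text)
    return [p.strip() for p in parts if p.strip()]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(instance_vectors, query):
    """Dot-product attention of one attribute query over instance vectors.

    Returns the attention weights and the weighted sum of the instances.
    """
    scores = [sum(a * b for a, b in zip(vec, query)) for vec in instance_vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, instance_vectors))
        for i in range(len(query))
    ]
    return weights, pooled


text = "Tumor located in the sigmoid colon. No lymph node metastasis; margins are clear."
instances = split_instances(text)

# Hypothetical instance vectors; in practice these would come from an encoder
# over fine-tuned word embeddings.
vectors = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
# Hypothetical learned query for one attribute extractor.
query = [0.0, 1.0]

weights, pooled = attend(vectors, query)
best_instance = instances[weights.index(max(weights))]
```

In a multi-task setting, each attribute extractor would hold its own query vector while sharing the instance encoder, which is one common way to share information across tasks; whether the paper does exactly this is not stated here.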

Highlights

With the continuous development of science and technology, the research results on data have been gradually applied to various domains
We focus on extracting both the discriminative and extractive attributes, which is more practical in real-world applications
In this paper, we use pre-trained word embeddings to better initialize the parameters of our models, and we fine-tune them on our domain corpus to capture domain-specific semantics/knowledge
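
For the extractive attributes, a sequence labeler assigns one tag per token, and attribute values are then read off the tagged spans. The sketch below shows only that last decoding step, assuming a BIO tagging scheme and hypothetical attribute names (SIZE, SITE); neither the scheme nor the label set is confirmed by this excerpt, and the tags are hand-written rather than produced by a BiLSTM-CNN-CRF model.

```python
def decode_bio(tokens, tags):
    """Collect (attribute, value) pairs from BIO tags aligned with tokens.

    An I- tag that does not continue the currently open span is treated
    like an outside tag and closes that span.
    """
    spans = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; flush any span that is still open.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continue the open span.
            words.append(token)
        else:
            # Outside tag (or inconsistent I- tag): close the open span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Hand-written example; SIZE and SITE are hypothetical attribute names.
tokens = ["Tumor", "size", "3.5", "cm", "in", "sigmoid", "colon"]
tags = ["O", "O", "B-SIZE", "I-SIZE", "O", "B-SITE", "I-SITE"]
values = decode_bio(tokens, tags)
```

With a CRF as the final layer, the predicted tag sequence is decoded jointly over the sentence, which tends to reduce inconsistent sequences such as an I- tag with no preceding B- tag.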

Summary

Introduction

With the continuous development of science and technology, research results on data have gradually been applied to various domains. The data in Electronic Medical Records (EMR) systems has attracted the attention of researchers and has become a major research topic. EMR data contains a large amount of basic patient information, diagnosis reports and medical knowledge, all of which are valuable assets in the medical domain. Only structured data can serve medical research. The main work of this paper is to transform the unstructured intestinal cancer diagnostic text into structured


Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 7
License type: CC BY 4.0
