A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging

Huanyao Zhang, Shaolei Li, Xudong Lu, Nan Wu, Danqing Hu, Huilong Duan

doi:10.1186/s12911-021-01575-x

Abstract

Background: Computed tomography (CT) reports record a large volume of valuable information about patients’ conditions and radiologists’ interpretations of radiology images, which can be used for clinical decision-making and further academic study. However, the free-text nature of clinical reports is a critical barrier to using these data more effectively. In this study, we investigate a novel deep learning method to extract entities from Chinese CT reports for lung cancer screening and TNM staging.

Methods: The proposed approach is a new named entity recognition algorithm, the BERT-based-BiLSTM-Transformer network (BERT-BTN) with pre-training, which extracts clinical entities for lung cancer screening and staging. Specifically, instead of traditional word embedding methods, BERT is applied to learn deep semantic representations of characters. Following the long short-term memory layer, a Transformer layer is added to capture the global dependencies between characters. In addition, a pre-training technique is employed to alleviate the problem of insufficient labeled data.

Results: We verify the effectiveness of the proposed approach on a clinical dataset containing 359 CT reports collected from the Department of Thoracic Surgery II of Peking University Cancer Hospital. The experimental results show that the proposed approach achieves an 85.96% macro-F1 score under the exact match scheme, improving on BERT-BTN, BERT-LSTM, BERT-fine-tune, BERT-Transformer, FastText-BTN, FastText-BiLSTM and FastText-Transformer by 1.38%, 1.84%, 3.81%, 4.29%, 5.12%, 5.29% and 8.84%, respectively.

Conclusions: In this study, we developed a novel deep learning method, i.e., BERT-BTN with pre-training, to extract clinical entities from Chinese CT reports. The experimental results indicate that the proposed approach can efficiently recognize various clinical entities about lung cancer screening and staging, which shows the potential for further clinical decision-making and academic research.
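The macro-F1 score under an exact match scheme reported above can be illustrated with a small sketch. It assumes that an entity counts as correct only when its document, start offset, end offset and type all match a gold annotation, and that macro-F1 is the unweighted mean of the per-type F1 scores. The function name, the tuple layout and the entity type names ("Size", "Location") are hypothetical and chosen for illustration only; this is not the authors' evaluation code.

```python
def exact_match_macro_f1(gold, pred):
    """Compute macro-F1 over entity types with exact span matching.

    gold, pred: sets of (doc_id, start, end, entity_type) tuples.
    An entity is a true positive only if the whole tuple matches.
    """
    types = {e[3] for e in gold} | {e[3] for e in pred}
    f1_by_type = {}
    for t in sorted(types):
        g = {e for e in gold if e[3] == t}
        p = {e for e in pred if e[3] == t}
        tp = len(g & p)  # exact matches only; partial overlaps count as errors
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        f1_by_type[t] = 2 * precision * recall / denom if denom else 0.0
    macro = sum(f1_by_type.values()) / len(f1_by_type) if f1_by_type else 0.0
    return macro, f1_by_type


# Toy annotations with hypothetical entity types (not the paper's label set).
gold = {(0, 0, 3, "Size"), (0, 5, 9, "Location"), (1, 2, 6, "Location")}
# The second prediction ends one character early, so it is not an exact match.
pred = {(0, 0, 3, "Size"), (0, 5, 8, "Location"), (1, 2, 6, "Location")}

macro, per_type = exact_match_macro_f1(gold, pred)
print(macro, per_type)
```

Because every type contributes equally to the mean, a rare entity type that is recognized poorly lowers macro-F1 as much as a frequent one would.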

Highlights

Computed tomography (CT) reports record a large volume of valuable information about patients’ conditions and the interpretations of radiology images from radiologists, which can be used for clinical decision-making and further academic study
The experimental results indicate that the proposed approach can efficiently recognize various clinical entities about lung cancer screening and staging, which shows the potential for further clinical decision-making and academic research
We proposed a novel deep learning approach, namely the Bidirectional Encoder Representations from Transformers (BERT)-based bidirectional long short-term memory (BiLSTM)-Transformer network (BERT-BTN) with pre-training, to extract 14 types of clinical entities from chest CT reports for lung cancer screening and TNM staging
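
The Transformer part of the network is described as capturing global dependencies between characters. A minimal sketch of the underlying idea, scaled dot-product self-attention, is given below. It assumes a single attention head, no learned query, key or value projections, and toy two-dimensional vectors standing in for the per-character outputs of the BiLSTM layer. All names are hypothetical, and this is an illustration of the general mechanism, not the authors' implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention without learned weights.

    vectors: list of equal-length float lists, one per character.
    Returns (outputs, weights); weights[i][j] is how much character i
    attends to character j.
    """
    d = len(vectors[0])
    outputs, weights = [], []
    for q in vectors:
        # Score the query against every position, near or far.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        # Each output is a weighted mix of all character vectors.
        out = [sum(w[j] * vectors[j][i] for j in range(len(vectors))) for i in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy stand-ins for per-character hidden states (assumed values).
chars = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(chars)
for row in weights:
    print([round(x, 3) for x in row])
```

Every attention weight is strictly positive, so each character's output depends on all other characters in the sequence regardless of distance, which is the sense in which the layer is "global".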

Summary

Introduction

Computed tomography (CT), as the primary examination for lung cancer, produces reports that record a large volume of valuable information about patients’ conditions and radiologists’ interpretations, which can be used for clinical diagnosis and progression assessment. When extraction relies on hand-crafted rules, simplified rules can hardly cover all language phenomena, while intricate rules are difficult to update and maintain and often lead to poor generalization and portability [11]. To alleviate these problems, many researchers turned to machine learning algorithms, e.g., support vector machines (SVM) and conditional random fields (CRF), and achieved strong performance on named entity recognition (NER) [12,13,14,15]. However, the performance of these statistical methods relies heavily on predefined features, which can hardly cover all useful semantic representations for recognition, resulting in poor discriminatory ability of the model [16].

Journal: BMC Medical Informatics and Decision Making
Publication Date: Jul 1, 2021
Citations: 10
License type: open-access
