Development and multicenter validation of chest X-ray radiography interpretations based on natural language processing

Yaping Zhang, Yao Shen, Mingqian Liu, Geertruida H De Bock, Beibei Jiang, Xueqian Xie, Jun Lan, Shundong Hu, Rozemarijn Vliegenthart, Xu Chen

doi:10.1038/s43856-021-00043-x

Abstract

Background

Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation. The purpose of this study is to extract CXR labels from diagnostic reports based on natural language processing, train convolutional neural networks (CNNs), and evaluate the classification performance of the CNNs using CXR data from multiple centers.

Methods

We collected the CXR images and corresponding radiology reports of 74,082 subjects as the training dataset. The linguistic entities and relationships in the unstructured radiology reports were extracted by the bidirectional encoder representations from transformers (BERT) model, and a knowledge graph was constructed to represent the association between image labels of abnormal signs and the report text of CXR. Then, a 25-label classification system was built to train and test the CNN models with weakly supervised labeling.

Results

In three external test cohorts of 5,996 symptomatic patients, 2,130 screening examinees, and 1,804 community clinic patients, the mean AUC of the CNN in identifying 25 abnormal signs reaches 0.866 ± 0.110, 0.891 ± 0.147, and 0.796 ± 0.157, respectively. In symptomatic patients, the CNN shows no significant difference from local radiologists in identifying 21 signs (p > 0.05), but is poorer for 4 signs (p < 0.05). In screening examinees, the CNN shows no significant difference for 17 signs (p > 0.05), but is poorer at classifying nodules (p = 0.013). In community clinic patients, the CNN shows no significant difference for 12 signs (p > 0.05), but performs better for 6 signs (p < 0.001).

Conclusion

We construct and validate an effective CXR interpretation system based on natural language processing.
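The "mean AUC ± spread" figures reported per cohort can be illustrated with a minimal sketch. It assumes that an AUC is computed separately for each sign and that the ± value is a sample standard deviation across signs; the abstract does not state either detail. The sign names `sign_b` and `sign_c`, the toy labels and scores, and the helper names `roc_auc` and `summarize_auc` are all hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def roc_auc(y_true, y_score):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_auc(labels, scores):
    """Return per-sign AUCs plus their mean and sample standard deviation."""
    per_sign = {sign: roc_auc(labels[sign], scores[sign]) for sign in labels}
    values = list(per_sign.values())
    return per_sign, mean(values), stdev(values)


# Toy data: three placeholder signs instead of the 25 used in the study.
labels = {
    "nodule": [1, 0, 1, 0],
    "sign_b": [1, 1, 0, 0],
    "sign_c": [0, 1, 0, 1],
}
scores = {
    "nodule": [0.9, 0.2, 0.4, 0.5],
    "sign_b": [0.8, 0.7, 0.3, 0.1],
    "sign_c": [0.3, 0.6, 0.6, 0.9],
}

per_sign, auc_mean, auc_sd = summarize_auc(labels, scores)
print(per_sign)
print(f"mean AUC = {auc_mean:.3f} ± {auc_sd:.3f}")
```

Because rare signs contribute few positive cases, a per-sign AUC can vary widely between cohorts, which is one plausible reason a summary across signs carries a large spread.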

Highlights

Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation
Using the bidirectional encoder representations from transformers (BERT) model to extract the linguistic entities and relationships from the unstructured radiology reports, a knowledge graph was constructed to represent the relationship between CXR labels and report content, which laid the foundation for training convolutional neural networks (CNNs) with weakly supervised labeling
Due to the low incidence of abnormal signs on CXR in real-world practice, a large number of subjects were included to test the performance of the CNN model
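The idea of deriving weak image labels from report text can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: a plain dictionary stands in for the knowledge graph, the entity extractor (BERT in the study) is assumed to have already produced `(phrase, is_negated)` pairs, and negated findings are assumed to map to a negative label. The names `LABELS`, `PHRASE_TO_LABEL`, and `weak_labels`, the placeholder signs, and the example phrases are hypothetical.

```python
# Placeholder label set; the study uses 25 abnormal signs.
LABELS = ["nodule", "sign_b", "sign_c"]

# Toy stand-in for the knowledge graph: report phrase -> canonical label.
PHRASE_TO_LABEL = {
    "nodule": "nodule",
    "nodular opacity": "nodule",
    "phrase for sign b": "sign_b",
}


def weak_labels(entities):
    """Turn extracted (phrase, is_negated) pairs into a multi-hot label vector."""
    found = set()
    for phrase, is_negated in entities:
        label = PHRASE_TO_LABEL.get(phrase.lower().strip())
        # Unknown phrases and negated findings do not produce a positive label.
        if label is not None and not is_negated:
            found.add(label)
    return [1 if name in found else 0 for name in LABELS]


# Assumed extractor output for one report.
entities = [("Nodular opacity", False), ("phrase for sign b", True)]
vector = weak_labels(entities)
print(dict(zip(LABELS, vector)))
```

The resulting vector would serve as the training target for the image paired with that report; no radiologist re-annotates the image, which is what makes the supervision weak.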

Summary

Introduction

Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation. The purpose of this study is to extract CXR labels from diagnostic reports based on natural language processing, train convolutional neural networks (CNNs), and evaluate the classification performance of the CNNs using CXR data from multiple centers.

We collected the CXR images and corresponding radiology reports of 74,082 subjects as the training dataset. The linguistic entities and relationships in the unstructured radiology reports were extracted by the bidirectional encoder representations from transformers (BERT) model, and a knowledge graph was constructed to represent the association between image labels of abnormal signs and the report text of CXR. A 25-label classification system was built to train and test the CNN models with weakly supervised labeling.

Journal: Communications Medicine	Publication Date: Oct 28, 2021
Citations: 10	License type: open-access
