Development and validation of deep learning and BERT models for classification of lung cancer radiology reports

S Mithun, Ashish Kumar Jha, Umesh B Sherkhane, Vinay Jaiswar, Nilendu C Purandare, V Rangarajan, A Dekker, Sander Puts, Inigo Bermejo, L Wee

doi:10.1016/j.imu.2023.101294

Abstract

Purpose: Manual cohort building from radiology reports can be tedious, and natural language processing (NLP) can automate it. In this study, we developed and validated an NLP approach based on deep learning (DL) to select lung cancer reports from a thoracic disease management group cohort.

Materials and methods: We used 4064 radiology reports (CT and PET/CT) from a thoracic disease management group, reported between 2014 and 2016. The reports were anonymised, cleaned, text normalized, and split into training, testing, and validation sets. External validation was performed on radiology reports from the MIMIC-III clinical database. We used three DL models, namely Bi-LSTM_simple, Bi-LSTM_dropout, and a pre-trained BERT model, to predict whether a report concerned lung cancer. We studied the effect of minority oversampling on all models.

Results: On internal validation, without oversampling, the F1 scores at 95% CI for Bi-LSTM_simple, Bi-LSTM_dropout, and BERT were 0.89, 0.90, and 0.86; with oversampling, the F1 scores were 0.94, 0.94, and 0.9. On external validation, the F1 scores of the Bi-LSTM_simple, Bi-LSTM_dropout, and BERT models were 0.63, 0.77, and 0.80 without oversampling and 0.72, 0.78, and 0.77 with oversampling.

Conclusion: The pre-trained BERT model and the Bi-LSTM_dropout model showed consistent performance on internal and external validation when predicting whether a report concerned lung cancer, with the BERT model exhibiting superior performance. The overall F1 score decreased on external validation for both Bi-LSTM models, with the Bi-LSTM_simple model showing the larger drop. All models showed some improvement with minority oversampling.
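
The abstract does not say which oversampling technique was used or how the F1 score was computed in code. As a rough illustration only, the sketch below assumes plain random duplication of minority-class reports and a binary F1 score with "lung cancer" as the positive class. The function names `oversample_minority` and `f1_score` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def oversample_minority(texts, labels, seed=0):
    """Randomly duplicate examples of smaller classes until all classes
    match the size of the largest class (assumed technique, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        # Pool of original examples belonging to this class.
        pool = [t for t, y in zip(texts, labels) if y == cls]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: four non-cancer reports (0) and one lung cancer report (1).
reports = ["report a", "report b", "report c", "report d", "report e"]
labels = [0, 0, 0, 0, 1]
balanced_reports, balanced_labels = oversample_minority(reports, labels)
```

In a setup like this, oversampling would normally be applied to the training split only, so that duplicated reports cannot leak into the sets used to compute F1.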


Journal: Informatics in Medicine Unlocked
Publication Date: Jan 1, 2023
Citations: 5
License type: cc-by

Similar Papers

Deep learning-based diagnosis of osteoblastic bone metastases and bone islands in computed tomograph images: a multicenter diagnostic study.
Yuchao Xiong ... Fan Xu
European radiology | VOL. 33
15 Apr 2023

Deep learning site classification model for automated photodocumentation in upper GI endoscopy (with video)
Liang Yen Liu ... Cadman L Leggett
iGIE | VOL. 2
14 Feb 2023

Performance of deep learning models constructed using panoramic radiographs from two hospitals to diagnose fractures of the mandibular condyle.
Masako Nishiyama ... Akitoshi Katsumata
Dentomaxillofacial Radiology | VOL. 50
26 Mar 2021

Predicting muscle invasion in bladder cancer by deep learning analysis of MRI: comparison with vesical imaging-reporting and data system.
Jianpeng Li ... Zhaohong Pan
European Radiology | VOL. 33
25 Nov 2022
