Abstract

Background: Classifying disease response is an essential task in cancer research and needs to be done at scale. Automating this process can improve the efficiency of real-world evidence generation, potentially leading to better patient outcomes. We aim to develop and evaluate natural language processing (NLP) models for this task.

Methods: Using 6203 computed tomography (CT) and 1358 magnetic resonance imaging (MRI) reports from 587 patients with lung cancer of all stages seen at the National Cancer Centre Singapore (NCCS), we trained four NLP models (BioBERT, RadBERT-RoBERTa, BioClinicalBERT, GatorTron) to classify each report into one of four categories: no evidence of disease, stable disease, partial response, or disease progression. Model output was compared against human-curated ground truth, and performance was evaluated by accuracy.

Results: Of the four models, GatorTron performed best (accuracy = 97.1%), followed by RadBERT-RoBERTa (accuracy = 96.2%), BioBERT (accuracy = 94.2%), and BioClinicalBERT (accuracy = 90.4%). Model runtimes on the dataset were relatively short: BioBERT and BioClinicalBERT took 3 minutes per epoch, RadBERT-RoBERTa took 6 minutes per epoch, and GatorTron took 10 minutes per epoch on a single central processing unit (CPU).

Conclusions: We have demonstrated the effectiveness of NLP models for classifying disease response in radiology reports of patients with lung cancer. This approach has the potential to help derive progression-free survival for real-world evidence generation.
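As an illustration of the evaluation and the downstream use described above, the following minimal Python sketch computes accuracy against curated labels and finds the first report labelled as disease progression for one patient. It is not the authors' code. The function names, the index date, and the example data are hypothetical, and the progression helper ignores death and censoring, both of which a real progression-free survival analysis would need to handle.

```python
from datetime import date

# The four response categories named in the abstract.
LABELS = (
    "no evidence of disease",
    "stable disease",
    "partial response",
    "disease progression",
)


def accuracy(predicted, truth):
    """Fraction of reports whose predicted label matches the curated label."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("at least one report is required")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


def first_progression(reports):
    """Date of the earliest report labelled 'disease progression', or None.

    `reports` is a list of (report_date, label) pairs for one patient
    (a hypothetical input layout, not taken from the abstract).
    """
    dates = [d for d, label in reports if label == "disease progression"]
    return min(dates) if dates else None


def days_to_progression(index_date, reports):
    """Days from a hypothetical index date to first progression, or None."""
    event = first_progression(reports)
    return None if event is None else (event - index_date).days


# Made-up example: four reports, one misclassified.
truth = ["stable disease", "partial response",
         "disease progression", "stable disease"]
predicted = ["stable disease", "partial response",
             "disease progression", "partial response"]
print(accuracy(predicted, truth))  # 0.75
```

In practice the predicted labels would come from one of the fine-tuned models, and the time-to-event step would be handed to a survival analysis method that accounts for censored patients.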

