Abstract 4965: Detection of status of cancer in radiology notes using artificial intelligence

Ankur Arya, Andrew Niederhausern, Nadia S Bahadur, John Philip, Chelsea Nichols, Avijit Chatterjee, Neil J Shah

doi:10.1158/1538-7445.am2024-4965

Abstract

Introductory Statement: The goal is to use an AI model to replicate human curators’ determinations of cancer status in the radiology notes of ~20,000 cancer patients.

Introduction: MSKCC currently has ~100,000 patients with genomic testing (MSK-IMPACT) and continues to accrue more. Clinicians use these genomic data for research but lack structured clinical data to analyze alongside them. We use a vendor, VASTA Global, to manually curate unstructured paragraph text. Depending on the data model, a patient’s full cancer history can take up to 8 hours to curate. We therefore want to use AI to automate this curation and save time and cost, with the aim of curating more than one patient per day and catching up to the 100,000-patient MSK-IMPACT cohort. To decrease the average curation time per patient, we investigated the use of an NLP model to replicate manual curation of the PRISSMM™ ontology field for change in cancer status from radiology reports. Manual curation of radiology reports is time-intensive because curators must interpret each report to determine cancer status and because patients undergo a large volume of scans.

Methods: A pre-trained Bidirectional Encoder Representations from Transformers (BERT) model was fine-tuned on training data using GPUs on the IBM Cloud Pak for Data (CPD) platform. Hyperparameters were tuned using accuracy and F1 score on the evaluation data. The model was then evaluated on held-out test data using a one-vs-rest approach, with results shown in Table 1. Table 1 covers classes 1-5: progressing/worsening/enlarging; stable/no change; improving/responding; not stated/indeterminate; and mixed.

Summary: The weighted AUCROC is ~0.97, the weighted F1 score is ~0.85, and the weighted accuracy is ~93%. These scores improve when only notes with higher predicted class probabilities are considered.
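The one-vs-rest evaluation and the probability-based filtering described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, the toy probabilities, and the 0.8 threshold are all hypothetical, chosen only to show how per-class metrics are obtained by treating one class as positive and the rest as negative, and how low-confidence notes could be set aside.

```python
from collections import namedtuple

Metrics = namedtuple("Metrics", "precision recall f1 accuracy")


def one_vs_rest_metrics(y_true, y_pred, positive_class):
    """Score one class against all others (one-vs-rest)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive_class and truth == positive_class:
            tp += 1
        elif pred == positive_class:
            fp += 1
        elif truth == positive_class:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return Metrics(precision, recall, f1, accuracy)


def filter_confident(probabilities, threshold):
    """Keep (note index, predicted class) pairs whose top probability
    meets the threshold. Classes are numbered from 1, as in Table 1."""
    kept = []
    for index, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            kept.append((index, best + 1))
    return kept


# Hypothetical toy data, not taken from the study.
toy_true = [1, 2, 3, 1, 4, 5, 2, 1]
toy_pred = [1, 2, 3, 2, 4, 4, 2, 1]
toy_probs = [
    [0.90, 0.05, 0.02, 0.02, 0.01],  # confident prediction of class 1
    [0.30, 0.30, 0.20, 0.10, 0.10],  # ambiguous note, dropped at 0.8
]

class_1 = one_vs_rest_metrics(toy_true, toy_pred, positive_class=1)
confident_notes = filter_confident(toy_probs, threshold=0.8)
```

In a workflow like the one described, notes that fall below the threshold could be routed back to human curators, which is one plausible reading of why the reported scores rise when only high-probability notes are kept.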
Conclusion: With a calculated accuracy above 0.9, this NLP model successfully replicates the curated cancer status results. The next step will be to run the model on a new cohort of patients who have not yet been curated.

Table 1. Cancer Status Metrics

Class             Prevalence  Precision  Recall  F1 Score  AUCROC  Accuracy
1                 38.9%       88.7%      87.0%   87.9%     96.5%   90.7%
2                 24.7%       89.9%      88.2%   89.1%     98.3%   94.6%
3                 15.9%       87.6%      90.7%   89.1%     98.7%   96.5%
4                 15.7%       77.0%      72.6%   74.7%     94.4%   92.3%
5                 4.8%        68.1%      74.4%   71.1%     97.9%   97.1%
Weighted Average  -           85%        86%     85%       97%     93%

Citation Format: Ankur Arya, Andrew Niederhausern, Nadia S. Bahadur, Neil J. Shah, Chelsea Nichols, Avijit Chatterjee, John Philip. Detection of status of cancer in radiology notes using artificial intelligence [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 4965.
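As a rough consistency check on Table 1, the per-class F1 scores can be recomputed from precision and recall, and the summary row can be approximated by weighting each class by its prevalence. The abstract does not state which weights were used, so prevalence weighting is an assumption here, and the names `TABLE_1`, `f1_score`, and `weighted_average` are illustrative. Because the table values are rounded, the recomputed figures need not match the reported ones exactly.

```python
# Per-class values transcribed from Table 1, expressed as fractions.
TABLE_1 = {
    1: {"prevalence": 0.389, "precision": 0.887, "recall": 0.870,
        "f1": 0.879, "aucroc": 0.965, "accuracy": 0.907},
    2: {"prevalence": 0.247, "precision": 0.899, "recall": 0.882,
        "f1": 0.891, "aucroc": 0.983, "accuracy": 0.946},
    3: {"prevalence": 0.159, "precision": 0.876, "recall": 0.907,
        "f1": 0.891, "aucroc": 0.987, "accuracy": 0.965},
    4: {"prevalence": 0.157, "precision": 0.770, "recall": 0.726,
        "f1": 0.747, "aucroc": 0.944, "accuracy": 0.923},
    5: {"prevalence": 0.048, "precision": 0.681, "recall": 0.744,
        "f1": 0.711, "aucroc": 0.979, "accuracy": 0.971},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def weighted_average(metric):
    """Prevalence-weighted mean of one metric across the five classes.

    Assumption: weights are the class prevalences from Table 1.
    """
    total = sum(row["prevalence"] for row in TABLE_1.values())
    weighted = sum(row["prevalence"] * row[metric] for row in TABLE_1.values())
    return weighted / total


# Recompute F1 per class and the summary row for comparison with Table 1.
recomputed_f1 = {
    cls: f1_score(row["precision"], row["recall"])
    for cls, row in TABLE_1.items()
}
summary = {
    metric: weighted_average(metric)
    for metric in ("precision", "recall", "f1", "aucroc", "accuracy")
}
```

Under this assumption the recomputed per-class F1 scores agree with the table to within rounding, and the weighted AUCROC and accuracy land close to the reported ~0.97 and ~93%.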
