Toward automating RECIST 1.1: Improving AI new lesion detection with longitudinal image data.

Amanda Gong, Gabriel Melendez-Corres, Matthew S Brown, Wahi-Anwar Muhammad, Kathleen Ruchalski, Morgan Daly, Jonathan Goldin, Tanmay Sanjay Hukkeri, Pang Yu Teng

doi:10.1200/jco.2023.41.16_suppl.e13545


Abstract


e13545 Background: Automating RECIST 1.1 would offer benefits in speed, consistency, and cost of clinical trial imaging reads. Detection of new lesions is subjective and a common source of reader adjudication. Current AI lesion detection typically uses single-timepoint images. The use of temporal data in AI is a rapidly evolving field in non-medical applications, and new lesion detection in response assessment may benefit from longitudinal multi-timepoint images. We explore the impact of longitudinal image input to deep neural networks for new lesion detection; as a proof of concept, we compare single- vs dual-timepoint input in new liver lesion detection in CT.

Methods: We utilized CT images from two public datasets: 1) diseased liver (DL), a subset of DeepLesion (doi: 10.1117/1.JMI.5.3.036501), and 2) healthy liver (HL) from potential liver donors (Med Image Anal 69 (2021) 101950). Due to the lack of a publicly available longitudinal image dataset with new lesions, we simulated (“sim”) a longitudinal study by registering HL scans (sim healthy baseline) with DL scans (sim follow-up with new lesion) and selecting well-paired images using mutual information. From each pair of scans, we selected one positive (with new lesion on follow-up) and one negative (without lesion on follow-up) 6 × 6 cm image patch from within the liver, creating a dataset of 3,296 patches with a balanced number of pos/neg samples. We trained two deep neural networks to classify the presence of a new liver lesion: 1) a conventional single-timepoint model (ST) trained on DL patches only, i.e. using the follow-up scan only, and 2) an experimental dual-timepoint model (DT) trained on patch pairs from DL-HL paired scans. We used 5-fold cross-validation and paired t-tests to compare model AUC and accuracy.

Results: Of 10,594 CT studies (4,427 patients, pts) in DeepLesion, we paired 1,648 DL scans (946 pts with liver lesions) with one of 20 registered HL scans (20 pts).
The DT model outperformed the ST model, increasing AUC from 0.914 ± 0.002 to 0.929 ± 0.005 (p=0.040), with similar trends in accuracy (dataset split, Table 1).

Conclusions: The DT model outperformed the ST model in the detection of new liver lesions, providing preliminary evidence of the benefit of multi-timepoint input to a deep neural network using sim longitudinal data. The use of longitudinal image data to improve interval change detection is a step towards automating radiologic response assessment in clinical trials. We plan to extend this work to automate new lesion detection in real clinical trial longitudinal images.

Results (bold) are reported on the test set across 5 folds. N = number of image patches (ST) or patch pairs (DT). [Table: see text]
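
The pairing step in Methods (selecting a well-paired healthy baseline for each diseased follow-up scan using mutual information) can be illustrated with a minimal sketch. The abstract does not describe the implementation, so everything below is an assumption made for illustration: the function names `mutual_information` and `best_baseline`, the 16-bin joint histogram, the representation of each registered scan as a flat list of intensities normalized to [0, 1], and the rule of keeping the highest-scoring baseline.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=16, lo=0.0, hi=1.0):
    """Mutual information (in nats) between two equal-length intensity lists.

    Assumption: both scans are already registered and flattened, so that
    a[k] and b[k] refer to the same anatomical location.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")

    def to_bin(v):
        # Clamp to [lo, hi], then map to an integer bin index in [0, bins - 1].
        frac = (min(max(v, lo), hi) - lo) / (hi - lo)
        return min(int(frac * bins), bins - 1)

    ia = [to_bin(v) for v in a]
    ib = [to_bin(v) for v in b]
    n = len(ia)

    joint = Counter(zip(ia, ib))  # joint histogram counts
    count_a = Counter(ia)         # marginal counts for scan a
    count_b = Counter(ib)         # marginal counts for scan b

    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log(p(i,j) / (p(i) * p(j))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
    return mi


def best_baseline(follow_up, baselines):
    """Return (index, score) of the baseline with the highest MI (hypothetical rule)."""
    scores = [mutual_information(follow_up, b) for b in baselines]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

In this sketch, a follow-up scan would be compared against each of the registered healthy scans and paired with the one that shares the most intensity structure; a real pipeline would likely also apply a minimum-score threshold, which the abstract does not specify.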
