In oncology trials, response evaluation criteria are pivotal in developing new treatments. This study examines how measurement variability in brain lesions influences response classification, given that the long-standing cut-offs for progression and response were set before the era of submillimeter-resolution medical imaging. We replicate a key study using modern radiological tools. Sixteen radiologists measured twelve near-spherical brain tumors using visual estimation (eyeballing), diameter measurements, and artificial intelligence (AI)-assisted segmentations. The analyses of inter- and intraobserver variability from the original study were replicated. Additionally, we investigated the effect of measurement error on the misclassification of progressive disease using a computer simulation model. With AI-assisted segmentation as the reference, the combined effect of intra- and interobserver error ranged from 13.6% to 22.2% for eyeballing and from 6.8% to 7.2% for diameter measurements. In repeat measurements of the same tumor, progression (cut-off at 20% increase) was erroneously declared in 25.5% of instances for eyeballing and in 1.1% for diameter measurements. Response (cut-off at 30% decrease) was erroneously declared in 12.3% of instances for eyeballing and in 0% for diameter measurements. The simulation model showed that measurement error had a more pronounced impact in cases with fewer lesions. This study provides a minimum expected measurement error using real-world data. The impact of measurement error on misclassification under response evaluation criteria in brain lesions was most pronounced for eyeballing. Future research should focus on measurement error for different tumor types and assess its impact on response classification during patient follow-up.
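The reported link between lesion count and misclassification can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation model; it is a minimal illustration under stated assumptions: every lesion has the same hypothetical true diameter, the tumors are truly stable, each measurement carries independent Gaussian relative error with coefficient of variation `cv`, and progression is declared when the sum of measured diameters rises by at least the 20% cut-off. The function name and all parameter values are hypothetical.

```python
import random


def false_progression_rate(n_lesions, cv, n_trials=20000, cutoff=0.20, seed=0):
    """Estimate how often stable disease is misclassified as progression.

    Assumptions (illustrative only, not taken from the study):
    - all lesions share one true diameter that does not change over time
    - each measurement has independent Gaussian relative error with SD `cv`
    - progression is declared when the sum of diameters grows by >= `cutoff`
    """
    rng = random.Random(seed)
    true_diameter = 20.0  # mm, hypothetical value

    def measured_sum():
        # One reading session: measure every lesion once and sum the results.
        return sum(true_diameter * (1.0 + rng.gauss(0.0, cv))
                   for _ in range(n_lesions))

    false_calls = 0
    for _ in range(n_trials):
        baseline = measured_sum()
        follow_up = measured_sum()
        # Relative change between two measurements of unchanged tumors.
        if (follow_up - baseline) / baseline >= cutoff:
            false_calls += 1
    return false_calls / n_trials


if __name__ == "__main__":
    for n in (1, 2, 5):
        rate = false_progression_rate(n_lesions=n, cv=0.15)
        print(f"{n} lesion(s): false progression rate = {rate:.3f}")
```

Under these assumptions, errors on individual lesions partly cancel when several diameters are summed, so the estimated false progression rate falls as the number of lesions grows. That is consistent with the direction of the effect described in the abstract, although the actual rates depend on the error model chosen.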