Abstract

Background: Clinical validation of a predictive biomarker is especially difficult when the biomarker cannot be assessed retrospectively. A cost-effective, prospective multicenter replication study with rapid accrual is warranted before further validation studies, such as a marker-based strategy for treatment selection. However, it is often unknown how measurement error and bias in a multicenter trial will differ from those in single-institution studies.

Purpose: Power calculations using simulated data may inform the efficient design of a multicenter study to replicate single-institution findings. This case study used serial standardized uptake value (SUV) measures from 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) to predict early response to breast cancer neoadjuvant chemotherapy. We examined the impact of accelerating accrual through increased inclusion of secondary sites with greater levels of measurement error and bias. We also examined whether enrichment designs based on initial breast cancer uptake could increase study power for a fixed budget (200 total scans).

Methods: Reference FDG PET SUV data were sampled with replacement from a single-institution trial; pathologic complete response (pCR) data were simulated using a logistic regression model predicting response from mid-therapy percent change in SUV. The impact of increased error in SUV measurements in multicenter trials was simulated by sampling from error and bias distributions: 20% to 40% measurement error, 0% to 40% bias, and fixed error/bias values. The proportion of patients recruited from secondary sites (with higher additional error/bias than primary sites) varied from 25% to 75%.

Results: Reference power (from source data with no added error) was 0.92 for N = 100 to detect an association between percent change in SUV and response. With moderate (20%) simulated measurement error for 3/4, 1/2, and 1/4 of measurements and 40% for the remainder, power was 0.70, 0.61, and 0.53, respectively. The reduction in study power was similar for other manifestations of measurement error (bias as a percentage of true value, absolute error, and absolute bias). Enrichment designs, which recruit additional patients by not conducting a second scan in patients with unsuitable pre-therapy uptake (low baseline SUV), did not lead to greater power for studies constrained to the same total cost.

Limitations: Simulation parameters could be incorrect or may not generalize. Under a different logistic regression model relating mid-therapy percent change in SUV to pCR (with no relationship for patients with low baseline SUV, rather than the modest point estimate from the reference data), the enrichment design did have somewhat greater power than the unselected design.

Conclusion: Even moderate additional measurement error substantially reduced study power under both unselected and enrichment designs.
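The simulation approach described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code, and it will not reproduce the reported power values. Everything numeric in it is an assumption made for illustration: the normal distribution standing in for the resampled reference data, the logistic coefficients (-2.65 and -0.04), the log-normal multiplicative error applied independently to each of the two scans, and the helper names `simulate_trial`, `wald_z`, and `estimate_power`. Bias and enrichment designs are left out to keep the example short.

```python
import math
import random


def simulate_trial(rng, n, frac_secondary, sd_primary, sd_secondary):
    """Simulate one trial; return (observed percent change in SUV, pCR) lists."""
    xs, ys = [], []
    for _ in range(n):
        # ASSUMPTION: a normal draw stands in for resampling real reference data.
        true_change = min(max(rng.gauss(-45.0, 25.0), -95.0), 100.0)
        # ASSUMPTION: hypothetical logistic model; larger SUV decline -> more pCR.
        p_pcr = 1.0 / (1.0 + math.exp(-(-2.65 - 0.04 * true_change)))
        ys.append(1 if rng.random() < p_pcr else 0)
        # Secondary sites get the larger measurement error.
        sd = sd_secondary if rng.random() < frac_secondary else sd_primary
        ratio = 1.0 + true_change / 100.0  # true mid-therapy / baseline SUV
        # ASSUMPTION: independent log-normal error on each of the two scans.
        ratio *= math.exp(rng.gauss(0.0, sd) - rng.gauss(0.0, sd))
        xs.append(100.0 * (ratio - 1.0))
    return xs, ys


def wald_z(xs, ys, max_iter=50):
    """Wald z statistic for the slope of a one-predictor logistic regression."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if sd == 0 or sum(ys) in (0, n):
        return 0.0  # no variation in predictor or outcome: nothing to test
    zs = [(x - mean) / sd for x in xs]  # standardizing leaves the z statistic unchanged
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for z, y in zip(zs, ys):
            eta = max(-30.0, min(30.0, b0 + b1 * z))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * z
            h00 += w
            h01 += w * z
            h11 += w * z * z
        det = h00 * h11 - h01 * h01
        if det < 1e-10:
            return 0.0  # (near-)separation: treat as no detected association
        # Newton-Raphson step using the inverse of the 2x2 information matrix.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-8:
            break
    return b1 / math.sqrt(h00 / det)


def estimate_power(n, frac_secondary, sd_primary, sd_secondary, n_sims=300, seed=1):
    """Fraction of simulated trials with a two-sided Wald test significant at 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        xs, ys = simulate_trial(rng, n, frac_secondary, sd_primary, sd_secondary)
        if abs(wald_z(xs, ys)) > 1.96:
            hits += 1
    return hits / n_sims


# No added error versus 25% of patients at 20% error and 75% at 40% error.
reference_power = estimate_power(100, 0.0, 0.0, 0.0)
noisy_power = estimate_power(100, 0.75, 0.2, 0.4)
print(reference_power, noisy_power)
```

Varying `frac_secondary` between 0.25 and 0.75 mirrors the range of secondary-site recruitment described above. A faithful replication would replace the synthetic draw with resampling from the actual reference data and would use the study's fitted model coefficients.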
