Abstract Objective: High-throughput metabolomics assays can generate thousands of biomarker measurements and provide novel opportunities for prognostic modeling in colorectal cancer (CRC) research. The high dimensionality of metabolic data brings unforeseen statistical challenges, and traditional variable selection methods may not perform well because they must simultaneously achieve computational expediency, statistical accuracy, and algorithmic stability. Iterative Sure Independence Screening (ISIS) has demonstrated superior theoretical properties in such settings and may be a viable alternative. Methods: In a prospective study of 77 newly diagnosed CRC patients (stage I-IV), pre-surgical urinary samples were analyzed on a gas chromatography-mass spectrometry platform. After exclusion of metabolites with >30% coefficient of variation, 168 metabolites remained for statistical analysis. Raw measures were processed following a standard normalization pipeline. The primary outcome was overall survival (OS), measured from the date of cancer diagnosis. In addition to metabolomics data, the predictor set included baseline clinical characteristics, such as age, sex, body mass index, tumor site, tumor stage, and receipt of neo-adjuvant and/or adjuvant treatment. We applied the ISIS method with a Lasso penalty to a Cox regression model (ISIS-Lasso) to identify features associated with OS. Cox models with either Lasso regularization or backward selection were considered as competing methods. Model performance was assessed with two standard performance metrics: Uno's time-dependent Area Under the Curve (tAUC) and the Brier score, both with resampling-based validation.
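The screening idea behind the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: it ranks each feature by a marginal Cox score-test statistic at beta = 0 (a simplification swapped in for the marginal partial-likelihood fit), keeps the top n/log(n) features (a commonly suggested screening size, assumed here), and omits both the iterative step and the Lasso fit. The data are simulated with the same dimensions as the study (77 subjects, 168 features), and the names `cox_score_z` and `sis_screen` are hypothetical.

```python
import numpy as np


def cox_score_z(x, time, event):
    """Score-test z statistic for one covariate in a Cox model, evaluated at beta = 0."""
    order = np.argsort(-time)                      # sort by descending follow-up time
    x, t, d = x[order], time[order], event[order].astype(bool)
    # Risk-set size for subject i: number of subjects with time >= t_i (ties included).
    n_risk = np.searchsorted(-t, -t, side="right")
    mean_x = np.cumsum(x)[n_risk - 1] / n_risk     # risk-set mean of x
    mean_x2 = np.cumsum(x ** 2)[n_risk - 1] / n_risk
    u = np.sum(x[d] - mean_x[d])                   # score: observed minus expected at event times
    v = np.sum(mean_x2[d] - mean_x[d] ** 2)        # information: sum of risk-set variances
    return u / np.sqrt(v) if v > 0 else 0.0


def sis_screen(X, time, event, keep=None):
    """Rank features by |marginal z| and keep the top `keep` (default: floor(n / log n))."""
    n, p = X.shape
    if keep is None:
        keep = int(n / np.log(n))                  # assumed screening size, not from the abstract
    keep = min(keep, p)
    z = np.array([cox_score_z(X[:, j], time, event) for j in range(p)])
    ranked = np.argsort(-np.abs(z))                # strongest marginal association first
    return ranked[:keep], z


# Simulated data only: feature 0 raises the hazard, all other features are noise.
rng = np.random.default_rng(0)
n, p = 77, 168
X = rng.standard_normal((n, p))
true_time = rng.exponential(1.0 / np.exp(X[:, 0]))  # exponential survival with hazard exp(x0)
censor_time = rng.exponential(2.0, size=n)          # independent censoring
time = np.minimum(true_time, censor_time)
event = true_time <= censor_time

selected, z = sis_screen(X, time, event)
print("features retained after screening:", np.sort(selected))
```

In an ISIS-style procedure, the retained features would next go into a penalized Cox fit, and screening would be repeated conditional on the selected set; that loop is left out to keep the sketch short.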
Results: Based on bootstrapped tAUC curves, we demonstrated that the screening step in ISIS can significantly improve model performance: the mean (and median) predicted tAUC of ISIS-Lasso was higher at clinically relevant follow-up time points (2 and 5 years after diagnosis) than the corresponding measures from the other two models. The prediction error of ISIS-Lasso, based on the Brier score with the 0.632+ bootstrap, was also noticeably lower than that of the backward selection model. Based on the ISIS-Lasso model, we identified two features that were predictive of OS in CRC patients: tumor stage and cystine. Holding all other clinical measures fixed, patients with early-stage disease (I-III) had a 52% lower risk of death than patients with late-stage disease (IV), and a 1 standard deviation increase in cystine level was associated with a 62% increased risk of death. Conclusion: We have demonstrated the feasibility and effectiveness of an ISIS-based method to improve selection of prognostic models derived from metabolomics data. This may be especially useful for studies with moderate sample sizes. We have identified cystine as a potentially important prognostic biomarker. Citation Format: Tengda Lin, Biljana Gigic, Kenneth Boucher, Ben Haaland, David Liesenfeld, Robert Owen, Petra Schrotz-King, Jourgen Boehm, Anita Peoples, Augustin Scalbert, Martin Schneider, Jane Figueiredo, William Grady, Christopher Li, David Shibata, Erin Siegel, Adetunji Toriola, Alexis Ulrich, Neli Ulrich, Jincheng Shen, Jennifer Ose. Application of iterative sure independence screening to improve urinary metabolomics-based prediction of survival in colorectal cancer patients [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 892.