Abstract
Background: Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. However, this can lead to a loss of power when the sample size is small. Simple pooling of all cohorts, on the other hand, can lead to biased results, especially when the cohorts are heterogeneous.
Results: We propose a new Bayesian approach suitable for continuous molecular measurements and survival outcome that identifies the important predictors and provides a separate risk prediction model for each cohort. It allows sharing information between cohorts to increase power by assuming a graph linking predictors within and across different cohorts. The graph helps to identify pathways of functionally related genes and genes that are simultaneously prognostic in different cohorts.
Conclusions: Results demonstrate that our proposed approach is superior to the standard approaches in terms of prediction performance and increased power in variable selection when the sample size is small.
Highlights
Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers
We propose an extension of the Bayesian Cox model with “spike-and-slab” prior for variable selection by Treppmann et al. [35]: instead of modeling the regression coefficients independently, we incorporate graph information between covariates into variable selection via a Markov random field (MRF) prior
We offer a solution for sharing information across the subgroups to increase power in variable selection and improve prediction performance
Summary
Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. This can lead to a loss of power when the sample size is small. A popular alternative is the “spike-and-slab” prior, which uses latent indicators for variable selection and a mixture distribution for the regression coefficients [14, 35]. Chakraborty and Lozano [5] propose a Graph Laplacian prior for modeling the dependence structure between the regression coefficients through their precision matrix.
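To make the idea of latent indicators linked by a graph more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a commonly used form of the MRF prior on the binary inclusion indicators, log p(gamma | G) = a * sum(gamma) + b * gamma' G gamma up to a constant, and a normal spike and normal slab for the coefficients. The function names, the hyperparameter values `a`, `b`, `slab_sd`, `spike_sd`, and the toy graph are all hypothetical and chosen only for illustration.

```python
import random


def mrf_log_prior(gamma, G, a=-2.0, b=0.5):
    """Unnormalized log MRF prior for binary inclusion indicators.

    Assumed form: a * sum(gamma) + b * gamma' G gamma, where G is a
    symmetric 0/1 adjacency matrix linking covariates. A negative `a`
    favors sparsity; a positive `b` rewards selecting linked covariates.
    """
    n = len(gamma)
    n_selected = sum(gamma)
    pair_term = sum(
        gamma[i] * G[i][j] * gamma[j] for i in range(n) for j in range(n)
    )
    return a * n_selected + b * pair_term


def draw_coefficients(gamma, rng, slab_sd=1.0, spike_sd=0.01):
    """Draw regression coefficients from a two-component normal mixture.

    Selected covariates (gamma = 1) come from the wide slab, the others
    from the narrow spike centered at zero (an assumed parameterization).
    """
    return [rng.gauss(0.0, slab_sd if g == 1 else spike_sd) for g in gamma]


# Hypothetical toy graph: three covariates in a chain 0 - 1 - 2.
G = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

linked = mrf_log_prior([1, 1, 0], G)    # covariates 0 and 1 share an edge
unlinked = mrf_log_prior([1, 0, 1], G)  # covariates 0 and 2 do not

rng = random.Random(0)
beta = draw_coefficients([1, 1, 0], rng)
```

Under these assumed settings, selecting two covariates that share an edge receives a higher prior weight than selecting two unconnected ones, which is how a graph can encourage joint selection of functionally related genes.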