Abstract

BackgroundImportant objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. However, this can lead to a loss of power when the sample size is small. Simple pooling of all cohorts, on the other hand, can lead to biased results, especially when the cohorts are heterogeneous.ResultsWe propose a new Bayesian approach suitable for continuous molecular measurements and survival outcome that identifies the important predictors and provides a separate risk prediction model for each cohort. It allows sharing information between cohorts to increase power by assuming a graph linking predictors within and across different cohorts. The graph helps to identify pathways of functionally related genes and genes that are simultaneously prognostic in different cohorts.ConclusionsResults demonstrate that our proposed approach is superior to the standard approaches in terms of prediction performance and increased power in variable selection when the sample size is small.

Highlights

  • Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers

  • We propose an extension of the Bayesian Cox model with “spike-and-slab” prior for variable selection by Treppmann et al [35] in the sense that we incorporate graph information between covariates into variable selection via an Markov random field (MRF) prior instead of modeling the regression coefficients independently

  • We offer a solution for sharing information across the subgroups to increase power in variable selection and improve prediction performance

Read more

Summary

Introduction

Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. A separate prediction model is fitted using only the data of one specific cohort This can lead to a loss of power when the sample size is small. A popular alternative are “spike-and-slab” priors that use latent indicators for variable selection and a mixture distribution for the regression coefficients [14, 35]. Chakraborty and Lozano [5] propose a Graph Laplacian prior for modeling the dependence structure between the regression coefficients through their precision matrix

Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.