Combining heterogeneous subgroups with graph-structured variable selection priors for Cox regression

Katrin Madjar, Jörg Rahnenführer, Manuela Zucknick, Katja Ickstadt

doi:10.1186/s12859-021-04483-z

Abstract

Background: Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. However, this can lead to a loss of power when the sample size is small. Simple pooling of all cohorts, on the other hand, can lead to biased results, especially when the cohorts are heterogeneous.

Results: We propose a new Bayesian approach suitable for continuous molecular measurements and survival outcome that identifies the important predictors and provides a separate risk prediction model for each cohort. It allows sharing information between cohorts to increase power by assuming a graph linking predictors within and across different cohorts. The graph helps to identify pathways of functionally related genes and genes that are simultaneously prognostic in different cohorts.

Conclusions: Results demonstrate that our proposed approach is superior to the standard approaches in terms of prediction performance and increased power in variable selection when the sample size is small.

Highlights

Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers.
We propose an extension of the Bayesian Cox model with “spike-and-slab” prior for variable selection by Treppmann et al. [35]: instead of modeling the regression coefficients independently, we incorporate graph information between covariates into variable selection via a Markov random field (MRF) prior.
We offer a solution for sharing information across the subgroups to increase power in variable selection and improve prediction performance.

Summary

Introduction

Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. This can lead to a loss of power when the sample size is small. A popular alternative is the “spike-and-slab” prior, which uses latent indicators for variable selection and a mixture distribution for the regression coefficients [14, 35]. Chakraborty and Lozano [5] propose a Graph Laplacian prior for modeling the dependence structure between the regression coefficients through their precision matrix.
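To make the two ingredients named above more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a common form of the spike-and-slab prior (a narrow normal "spike" when the latent indicator is 0, a wide normal "slab" when it is 1) and a common form of the MRF prior on the indicators, log p(gamma | G) = a * sum(gamma) + b * gamma' G gamma up to a constant. The function names, the toy graph, and the hyperparameter values a, b, tau and c are illustrative assumptions and do not come from the paper.

```python
import random


def mrf_log_prior(gamma, G, a=-3.0, b=0.5):
    """Unnormalised log MRF prior on the inclusion indicators (assumed form).

    a < 0 penalises every included predictor (sparsity);
    b > 0 rewards including predictors that are linked in the graph G.
    """
    p = len(gamma)
    sparsity = a * sum(gamma)
    smoothness = b * sum(
        gamma[i] * G[i][j] * gamma[j] for i in range(p) for j in range(p)
    )
    return sparsity + smoothness


def sample_beta(gamma, tau=0.1, c=20.0, rng=None):
    """Draw coefficients from a normal-mixture spike-and-slab prior (assumed form).

    gamma_j = 0: beta_j ~ N(0, tau^2)        (spike, effectively zero)
    gamma_j = 1: beta_j ~ N(0, (c * tau)^2)  (slab, allows large effects)
    """
    rng = rng or random.Random()
    return [rng.gauss(0.0, c * tau if g == 1 else tau) for g in gamma]


# Hypothetical toy graph on 4 predictors with edges 0-1 and 1-2.
G = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

# Two models with the same number of selected predictors:
connected = mrf_log_prior([1, 1, 0, 0], G)  # predictors 0 and 1 share an edge
isolated = mrf_log_prior([1, 0, 0, 1], G)   # predictors 0 and 3 do not

beta = sample_beta([1, 1, 0, 0], rng=random.Random(1))
```

Under these assumptions, `connected` is larger than `isolated`: selecting two linked predictors is a priori more plausible than selecting two unlinked ones. This is the mechanism by which a graph can carry information between related genes, and, if edges also link the same gene across cohorts, between subgroups.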

Journal: BMC Bioinformatics
Publication Date: Dec 1, 2021
Citations: 4
License type: open-access
