Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data

R M Simon, S Menezes, J Subramanian, M.-C Li

doi:10.1093/bib/bbr001

Abstract

Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. Here we review methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell's concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets, and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we indicate how to utilize cross-validation for the evaluation of survival risk models; specifically, how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We also discuss evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data add predictive accuracy to a model based on standard covariates alone.
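The idea of cross-validated survival curves for predicted risk groups can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_risk_model`, `cross_validated_groups`, `simulate`) are hypothetical, the data are simulated, and a simple concordance-based feature ranking stands in for the penalized Cox regression one would normally fit. The point it tries to show is that feature selection and the risk cutoff are recomputed inside every training fold, so each patient is assigned to a risk group by a model that never saw that patient.

```python
import math
import random

def concordance(scores, times, events):
    """Harrell's C: share of usable pairs in which the higher score died first."""
    num = den = 0.0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[j] > times[i]:
                den += 1
                num += 1.0 if scores[i] > scores[j] else 0.5 if scores[i] == scores[j] else 0.0
    return num / den if den else 0.5

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time."""
    surv, curve = 1.0, []
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

def fit_risk_model(X, times, events, n_features):
    """Assumed stand-in for a Cox fit: rank features by |C - 0.5|, sum them with signs."""
    ranked = []
    for j in range(len(X[0])):
        c = concordance([row[j] for row in X], times, events)
        ranked.append((abs(c - 0.5), j, 1.0 if c >= 0.5 else -1.0))
    ranked.sort(reverse=True)
    chosen = [(j, sign) for _, j, sign in ranked[:n_features]]

    def score(row):
        return sum(sign * row[j] for j, sign in chosen)

    train_scores = sorted(score(row) for row in X)
    cutoff = train_scores[len(train_scores) // 2]  # approximately the training median
    return score, cutoff

def cross_validated_groups(X, times, events, k=5, n_features=5, seed=0):
    """Assign each patient to high/low risk using a model fit without that patient."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    high_risk = [None] * len(X)
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Feature selection and cutoff are redone from scratch on the training fold.
        score, cutoff = fit_risk_model([X[i] for i in train],
                                       [times[i] for i in train],
                                       [events[i] for i in train], n_features)
        for i in test:
            high_risk[i] = score(X[i]) > cutoff
    return high_risk

def simulate(n=60, p=50, seed=1):
    """Toy data: only the first three of p features affect the hazard."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    times, events = [], []
    for row in X:
        death = rng.expovariate(math.exp(row[0] + row[1] + row[2]))
        censor = rng.expovariate(0.3)
        times.append(min(death, censor))
        events.append(death <= censor)
    return X, times, events

if __name__ == "__main__":
    X, times, events = simulate()
    groups = cross_validated_groups(X, times, events)
    for flag, label in ((True, "high risk"), (False, "low risk")):
        t = [u for u, g in zip(times, groups) if g == flag]
        e = [v for v, g in zip(events, groups) if g == flag]
        print(label, len(t), "patients; KM curve:", kaplan_meier(t, e)[-3:])
```

Pooling the held-out assignments and then drawing one Kaplan-Meier curve per group gives the cross-validated curves; computing the same curves from a model fit to all patients would be the re-substitution estimate that the abstract warns is highly biased.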

Journal: Briefings in Bioinformatics
Publication Date: Feb 15, 2011
Citations: 208

