Improving the Robustness of Variable Selection and Predictive Performance of Regularized Generalized Linear Models and Cox Proportional Hazard Models.

Feng Hong, Lu Tian, Viswanath Devanarayan

doi:10.3390/math11030557

Abstract

In biomedical research, high-dimensional data applications often rely on statistical and machine-learning algorithms to identify an optimal signature, based on biomarkers and other patient characteristics, that predicts the desired clinical outcome. Both the composition and the predictive performance of such biomarker signatures are critical in these applications. In the presence of a large number of features, however, a conventional regression analysis fails to yield a good prediction model. A widely used remedy is to introduce regularization when fitting the relevant regression model. In particular, an L1 penalty on the regression coefficients is extremely useful, and very efficient numerical algorithms have been developed for fitting such models with different types of responses. This L1-based regularization tends to generate a parsimonious prediction model with promising prediction performance, i.e., feature selection is achieved along with construction of the prediction model. The variable selection, and hence the composition of the signature, as well as the prediction performance of the model, depend on the choice of the penalty parameter used in the regularization. The penalty parameter is often chosen by K-fold cross-validation. However, such an algorithm tends to be unstable and may yield very different choices of the penalty parameter across multiple runs on the same dataset. In addition, the predictive performance estimates from the internal cross-validation procedure in this algorithm tend to be inflated. In this paper, we propose a Monte Carlo approach to improve the robustness of regularization parameter selection, along with an additional cross-validation wrapper for objectively evaluating the predictive performance of the final model. We demonstrate the improvements via simulations and illustrate the application via a real dataset.
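The instability described above, and the general idea of repeating K-fold cross-validation over many random fold assignments, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not say how the repeated runs are aggregated, so taking the median of the per-run choices is an assumption made here for illustration. The function names, the simulated data, and the penalty grid are all hypothetical, and the lasso is fitted by a plain coordinate-descent routine for a Gaussian response without an intercept.

```python
import random
import statistics

def soft_threshold(z, g):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, n_iter=30):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.
    No intercept is fitted (assumes roughly centered data)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def random_folds(n, k, rng):
    """Randomly partition indices 0..n-1 into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[f::k] for f in range(k)]

def cv_error(X, y, lam, folds):
    """Mean squared prediction error over the held-out folds."""
    total = 0.0
    for test_idx in folds:
        held = set(test_idx)
        train = [i for i in range(len(y)) if i not in held]
        beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        for i in test_idx:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / len(y)

def select_lambda_once(X, y, lambdas, k, rng):
    """One ordinary K-fold CV run: pick the penalty with the lowest CV error."""
    folds = random_folds(len(y), k, rng)
    errors = [cv_error(X, y, lam, folds) for lam in lambdas]
    return lambdas[errors.index(min(errors))]

def select_lambda_monte_carlo(X, y, lambdas, k=5, n_runs=8, seed=0):
    """Repeat K-fold CV over random fold assignments and aggregate.
    ASSUMPTION: aggregation by the (low) median of the per-run choices."""
    rng = random.Random(seed)
    picks = [select_lambda_once(X, y, lambdas, k, rng) for _ in range(n_runs)]
    return statistics.median_low(picks), picks

# Hypothetical simulated data: 2 informative features out of 6.
data_rng = random.Random(1)
n, p = 30, 6
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[1] + data_rng.gauss(0, 0.5) for row in X]
lambdas = [0.01, 0.05, 0.2, 1.0]

best, picks = select_lambda_monte_carlo(X, y, lambdas, k=5, n_runs=8, seed=0)
print("per-run choices:", picks)
print("aggregated choice:", best)
```

If the per-run choices differ from one another, that spread is the run-to-run instability of a single K-fold cross-validation; the aggregated choice is meant to be less sensitive to any one fold assignment. The sketch covers only penalty selection and leaves out the outer cross-validation wrapper that the abstract proposes for performance evaluation.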

Journal: Mathematics	Publication Date: Jan 20, 2023
Citations: 6	License type: CC BY 4.0

Similar Papers

Why you should stop predicting customer churn and start using uplift models
Floris Devriendt ... Wouter Verbeke
Information Sciences | VOL. 548
31 Dec 2020

Comparison of Some Prediction Models and their Relevance in the Clinical Research
Nihar Ranjan Panda ... Jitendra Kumar Pati
International Journal of Statistics in Medical Research | VOL. 12
08 Mar 2023

A simulation study of sample size demonstrated the importance of the number of events per variable to develop prediction models in clustered data
L Wynants ... Y Vergouwe
Journal of Clinical Epidemiology | VOL. 68
13 Feb 2015

Study on the Influence of the Number of Features on the Performance of Software Defect Prediction Model
Mengtian Cui ... Yang Lu
05 Jul 2019
