Abstract

Predictive models play a central role in decision making. Penalized regression approaches, such as the least absolute shrinkage and selection operator (LASSO), have been widely used to construct predictive models and explain the impacts of the selected predictors, but the estimates are typically biased. Moreover, when data are ultrahigh-dimensional, penalized regression is usable only after applying variable screening methods to reduce the number of variables. We propose a stepwise procedure for fitting generalized linear models with ultrahigh-dimensional predictors. Our procedure can provide a final model, control both false negatives and false positives, and yield consistent estimates, which are useful for gauging the actual effect sizes of risk factors. Simulations and applications to two clinical studies demonstrate the utility of the method.
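As a quick illustration of the bias the abstract refers to: under an orthonormal design, the LASSO solution is the soft-thresholded least-squares estimate, so even large coefficients are shrunk toward zero by the penalty. A minimal Python sketch of this standard fact (not code from the paper):

```python
import numpy as np

def soft_threshold(z, lam):
    """Coordinatewise LASSO solution under an orthonormal design:
    every surviving coefficient is shrunk by lam, hence the bias."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# An OLS estimate of 3.0 becomes 2.5 after thresholding at lam = 0.5.
print(soft_threshold(np.array([3.0, 0.2, -1.0]), 0.5))  # [ 2.5  0.  -0.5]
```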

Highlights

  • In the era of precision medicine, constructing interpretable and accurate predictive models, based on patients’ demographic characteristics, clinical conditions and molecular biomarkers, has been crucial for disease prevention, early diagnosis and targeted therapy [1].

  • By varying γ between γ_L and γ_H in forward regression (FR) and sequential conditioning (SC), we explored the impact of γ on inducing false positives and false negatives.

  • We evaluated the performance of the methods by the criteria of true positives (TP), false positives (FP), the estimated probability of including the true model (PIT), the mean squared error (MSE) of β̂, and the mean squared prediction error (MSPE); a code sketch of these criteria follows this list.
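As a concrete illustration of the criteria above, here is a minimal Python sketch of how they could be tallied for a single simulated replicate. The function and argument names are hypothetical, PIT is obtained by averaging the coverage indicator over replicates, and the prediction error is written for the linear case rather than a general GLM.

```python
import numpy as np

def replicate_metrics(selected, truth, beta_hat, beta_true, X_test, y_test):
    """Evaluation criteria for one simulated replicate.

    selected / truth : iterables of selected / truly active predictor indices
    beta_hat / beta_true : length-p coefficient vectors
    X_test, y_test : held-out data used for the prediction error
    """
    selected, truth = set(selected), set(truth)
    tp = len(selected & truth)                # true positives (TP)
    fp = len(selected - truth)                # false positives (FP)
    covers = truth <= selected                # averages to PIT over replicates
    mse = float(np.mean((beta_hat - beta_true) ** 2))         # MSE of the estimate
    mspe = float(np.mean((y_test - X_test @ beta_hat) ** 2))  # MSPE, linear case
    return {"TP": tp, "FP": fp, "covers_truth": covers, "MSE": mse, "MSPE": mspe}
```

Averaging `TP`, `FP`, `covers_truth`, `MSE`, and `MSPE` over all replicates gives the figures a simulation study would report.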


Introduction

In the era of precision medicine, constructing interpretable and accurate predictive models, based on patients’ demographic characteristics, clinical conditions and molecular biomarkers, has been crucial for disease prevention, early diagnosis and targeted therapy [1]. In ultrahigh-dimensional settings, where p is of an exponential order of n, penalized methods may incur computational challenges [3], may not reach globally optimal solutions, and often generate biased estimates [4]. Variable screening offers a computationally feasible alternative, but it relies on a partial faithfulness assumption, which stipulates that jointly important variables must be marginally important, an assumption that may not always be realistic. To relax this condition, iterative procedures such as iterative sure independence screening (ISIS) [5] repeatedly screen variables based on the residuals from previous iterations, but at the cost of heavy computation and unclear theoretical properties. Moreover, screening methods do not directly generate a final model; post-screening regularization methods, such as LASSO, are recommended by [5] to produce one.
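The residual-based idea behind iterative screening is easy to make concrete. Below is a minimal Python sketch of marginal screening followed by ISIS-style re-screening of the leftover predictors against the residuals, written for the linear case; the function names, the correlation-based ranking, and the default of two iterations are our illustrative assumptions, not the procedure proposed in this paper.

```python
import numpy as np

def marginal_screen(X, y, d):
    """Keep the d predictors with the largest absolute marginal correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
    return np.argsort(-np.abs(corr))[:d]

def isis_style_screen(X, y, d, n_iter=2):
    """Re-screen leftover predictors against residuals, in the spirit of ISIS.

    Plain marginal screening keeps only variables associated with y itself
    (the partial-faithfulness-type condition); screening the residuals can
    recover variables that are jointly but not marginally important.
    """
    active = list(marginal_screen(X, y, d))
    for _ in range(n_iter - 1):
        beta, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)  # fit current set
        resid = y - X[:, active] @ beta                          # working residuals
        rest = [j for j in range(X.shape[1]) if j not in active]
        for k in marginal_screen(X[:, rest], resid, d):          # screen leftovers
            active.append(rest[k])
    return sorted(set(active))
```

A post-screening regularized fit, such as LASSO restricted to the retained columns, would then produce the final model, in line with the recommendation of [5].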
