Statistical Inference for Variable Importance

Mark J Van Der Laan

doi:10.2202/1557-4679.1008

Abstract

Many statistical problems involve learning the importance, or effect, of a variable for predicting an outcome of interest, based on a sample of $n$ independent and identically distributed observations on a list of input variables and an outcome. For example, although prediction/machine learning is, in principle, concerned with learning the optimal unknown mapping from input variables to an outcome, the typical reported output is a list of importance measures, one for each input variable. The usual approach in prediction has been to learn the unknown optimal predictor from the data and then derive the importance of each input variable from the resulting fit. In this article we propose a new approach which involves, for each variable separately, 1) defining variable importance as a real-valued parameter, 2) deriving the efficient influence curve, and thereby the optimal estimating function, for this parameter in the assumed (possibly nonparametric) model, and 3) developing a corresponding double robust, locally efficient estimator of this variable importance, obtained by substituting data-adaptive estimators for the nuisance parameters in the optimal estimating function. We illustrate this methodology in the context of prediction and obtain in this manner double robust, locally optimal estimators of marginal variable importance, accompanied by p-values and confidence intervals. In addition, we present a model-based and machine learning approach to estimating covariate-adjusted variable importance. Finally, we generalize this methodology to variable importance parameters for time-dependent variables.
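
As a rough illustration of steps 1) to 3), the sketch below assumes a binary input variable A, covariates W, and an outcome Y, and takes marginal importance to be $\psi = E[E(Y \mid A=1, W) - E(Y \mid A=0, W)]$. It uses the standard augmented inverse-probability-weighted estimating function for this parameter. The binary-A setting, the parametric working models (linear for the outcome regression, logistic for the treatment mechanism), the truncation bounds, and the names `fit_logistic` and `dr_importance` are all assumptions made for illustration and are not taken from the article, which allows data-adaptive nuisance estimators.

```python
import math
import numpy as np

def fit_logistic(X, a, n_iter=25):
    """Fit a logistic regression by Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (a - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def dr_importance(W, A, Y):
    """Doubly robust estimate of psi with a Wald interval and p-value.

    Assumed working models: linear Q(A, W) = E(Y | A, W) and
    logistic g(W) = P(A = 1 | W).
    """
    n = len(Y)
    ones = np.ones(n)

    # Nuisance 1: outcome regression Q(A, W), fitted by least squares.
    coef, *_ = np.linalg.lstsq(np.column_stack([ones, A, W]), Y, rcond=None)
    Q1 = np.column_stack([ones, ones, W]) @ coef          # Q(1, W)
    Q0 = np.column_stack([ones, np.zeros(n), W]) @ coef   # Q(0, W)
    QA = np.where(A == 1, Q1, Q0)                         # Q(A, W)

    # Nuisance 2: treatment mechanism g(W) = P(A = 1 | W).
    Xg = np.column_stack([ones, W])
    g1 = 1.0 / (1.0 + np.exp(-Xg @ fit_logistic(Xg, A)))
    g1 = np.clip(g1, 0.01, 0.99)  # assumed truncation to keep weights bounded

    # Uncentered influence-curve values; their mean solves the
    # estimating equation for psi.
    D = (A / g1 - (1 - A) / (1 - g1)) * (Y - QA) + Q1 - Q0
    psi = D.mean()

    # Influence-curve-based standard error, Wald interval, two-sided p-value.
    se = D.std(ddof=1) / math.sqrt(n)
    ci = (psi - 1.96 * se, psi + 1.96 * se)
    p_value = math.erfc(abs(psi / se) / math.sqrt(2.0))
    return psi, se, ci, p_value

# Simulated data (hypothetical): A depends on W, and A shifts Y by 2.
rng = np.random.default_rng(0)
n = 5000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * W[:, 0])))
Y = 2.0 * A + W[:, 0] - W[:, 1] + rng.normal(size=n)
psi, se, ci, p_value = dr_importance(W, A, Y)
print(psi, se, ci, p_value)
```

In this form the estimate stays consistent if either the outcome regression or the treatment mechanism is correctly specified, which is the sense of "double robust" used here. Repeating the call with each input variable in the role of A gives one importance estimate, interval, and p-value per variable.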

Journal: The International Journal of Biostatistics
Publication Date: Aug 25, 2005
Citations: 126
