Abstract

We propose a variable importance measure called partial quantile utility (PQU). We then introduce a quantile forward regression (QFR) algorithm that uses PQU-based ranking to screen important variables from a candidate set whose dimension can be substantially larger than the sample size. We prove that QFR-based screening can identify all the important variables in a small number of steps. To remove noise variables retained by the screening step, we further perform variable selection using a modified Bayesian information criterion. We show that the resulting smaller set also contains all the important variables with overwhelming probability. Using simulation designs intentionally chosen to demonstrate the procedure's ability to identify variables that are jointly but not marginally important and to detect heterogeneous associations, we extensively investigate its finite-sample performance with regard to screening, selection, and out-of-sample prediction. To further illustrate the merit of our proposal, we apply it to the problem of identifying risk factors associated with childhood malnutrition in India.
