Abstract

Modern high-dimensional statistical inference often faces the problem of missing data. In recent decades, many studies have addressed this problem, proposing strategies that include complete-sample analysis and imputation procedures. However, complete-sample analysis discards the information in incomplete samples, while imputation procedures accumulate errors from each individual imputation. In this article, we propose a new method, the Sample-wise COmbined missing effect Model with penalization (SCOM), to deal with missing data in the predictors. Instead of imputing the predictors, SCOM estimates, for each incomplete sample, the combined effect of all its missing data. SCOM makes full use of all available data and is robust to various missingness mechanisms. Theoretical studies establish an oracle inequality for the proposed estimator, as well as the consistency of variable selection and of combined missing effect selection. Simulation studies and an application to the Residential Building Data illustrate the effectiveness of SCOM. Supplementary materials for this article are available online.

