Abstract

Sufficient dimension reduction (SDR) methods are effective tools for handling high-dimensional data. Classical SDR methods are developed under the assumption that the data are completely observed. For incomplete data, SDR has been considered only when values are missing at random, not when they are nonignorably missing, a setting that is arguably more difficult because the missingness mechanism depends on the unobserved values themselves. The purpose of this paper is to fill this void. We propose an intuitive, easy-to-implement SDR estimator, based on a semiparametric propensity score function, for response data with nonignorable missing values. We refer to it as the dimension reduction-based imputed estimator. We establish the theoretical properties of this estimator and examine its empirical performance in an extensive numerical study on real and simulated data. We also compare its performance with that of two competing estimators, the fusion refined estimator and the cumulative slicing estimator. A distinguishing feature of our method is that it requires no validation sample. The SDR theory developed in this paper is a non-trivial extension of the existing literature because of the technical challenges posed by nonignorable missingness. Technical proofs of the theorems are given in the Appendix.
