Abstract

Subsampling techniques are efficient methods for handling big data. Quite a few optimal subsampling methods have been developed for parametric models whose loss functions are differentiable with respect to the parameters. However, they do not apply to quantile regression (QR) models, because the check function involved is not differentiable. To circumvent the non-differentiability problem, we consider directly estimating the linear QR coefficient by minimizing the Hansen–Hurwitz estimator of the usual loss function for QR. We establish the asymptotic normality of the resulting estimator under a generic sampling method, and then develop optimal subsampling methods for linear QR. In particular, we propose a one-stage subsampling method, which depends only on the lengths of the covariate vectors, and a two-stage subsampling method, which combines the one-stage sampling and the ideal optimal subsampling methods. Our simulation and real-data-based simulation studies show that the two recommended sampling methods always outperform simple random sampling in terms of mean squared error, whether or not the linear QR model is valid.
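To make the construction concrete, the following is a minimal sketch (not the authors' implementation) of the one-stage subsampled estimator described above: sampling probabilities are taken proportional to the covariate norms, and the QR coefficient is estimated by minimizing the Hansen–Hurwitz (inverse-probability-weighted) check loss on the subsample. The synthetic data, variable names, and the choice of a derivative-free solver are all assumptions made for illustration.

```python
# Minimal sketch of one-stage subsampled quantile regression (illustrative only).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic full data: y = X @ beta + heavy-tailed noise (assumed setup)
n, d, tau = 100_000, 5, 0.5            # full-data size, dimension, quantile level
X = rng.normal(size=(n, d))
beta_true = np.arange(1, d + 1, dtype=float)
y = X @ beta_true + rng.standard_t(df=3, size=n)

# One-stage subsampling probabilities proportional to covariate lengths ||x_i||
norms = np.linalg.norm(X, axis=1)
probs = norms / norms.sum()

# Draw a subsample of size r with replacement
r = 2_000
idx = rng.choice(n, size=r, replace=True, p=probs)
Xs, ys, ps = X[idx], y[idx], probs[idx]

def check_loss(u, tau):
    """Quantile-regression check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def hh_objective(beta):
    """Hansen-Hurwitz estimator of the average full-data check loss."""
    resid = ys - Xs @ beta
    return np.mean(check_loss(resid, tau) / (n * ps))

# The check loss is convex but non-differentiable, so a derivative-free
# solver is used here purely for illustration.
fit = minimize(hh_objective, np.zeros(d), method="Nelder-Mead",
               options={"maxiter": 20_000, "xatol": 1e-6, "fatol": 1e-6})
print("subsample estimate:", np.round(fit.x, 3))
```

In this sketch the inverse-probability weights 1/(n p_i) make the subsample objective an unbiased estimate of the full-data average check loss, which is what allows the minimizer to target the same coefficient as the full-data QR fit.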
