Optimal subsampling for multiplicative regression with massive data

Tianzhen Wang, Haixiang Zhang

doi:10.1111/stan.12266

Abstract

Subsampling is a popular way to reduce data volume, and with it the computational burden, when working with massive data. The key idea is to perform statistical analysis on a representative subsample drawn from the full data, which offers a practical way to extract useful information from big data. In this article, we develop an efficient subsampling method for large-scale multiplicative regression models that substantially reduces the computational burden caused by massive data. Under some regularity conditions, we establish the consistency and asymptotic normality of the subsample-based estimator and derive the optimal subsampling probabilities according to the L-optimality criterion. A two-step algorithm is developed to approximate the optimal subsampling procedure, and the convergence rate and asymptotic normality of the resulting two-step subsample estimator are established. Numerical studies and two real-data applications are carried out to evaluate the performance of the proposed subsampling method.
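
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: the model is taken to be y = exp(x'beta) * error, the estimator is taken to minimise the least product relative error (LPRE) objective, and the second-step probabilities are taken to be proportional to |y_i exp(-x_i'beta) - exp(x_i'beta)/y_i| * ||x_i||, a form patterned on L-optimality results for other models. The function names fit_weighted_lpre and two_step_subsample are hypothetical, and pooling the pilot and second-step draws is a choice made for this sketch.

```python
import numpy as np


def fit_weighted_lpre(X, y, w, n_iter=50, tol=1e-10):
    """Minimise sum_i w_i * (y_i*exp(-x_i'b) + exp(x_i'b)/y_i - 2) by Newton's method.

    The LPRE objective is an assumption of this sketch, not taken from the abstract.
    """
    # Start from weighted least squares on log(y), the log-linear form of the model.
    sw = np.sqrt(w)
    beta = np.linalg.lstsq(X * sw[:, None], np.log(y) * sw, rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta
        a = y * np.exp(-eta)  # y_i * exp(-x_i'b)
        b = np.exp(eta) / y   # exp(x_i'b) / y_i
        grad = X.T @ (w * (b - a))
        hess = X.T @ (X * (w * (a + b))[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta - step
        if np.linalg.norm(step) < tol:
            break
    return beta


def two_step_subsample(X, y, r0, r, rng):
    """Two-step subsampling estimate (illustrative; probability form is assumed)."""
    n = X.shape[0]
    # Step 1: uniform pilot subsample gives a rough estimate of beta.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = fit_weighted_lpre(X[idx0], y[idx0], np.ones(r0))
    # Approximate the optimal probabilities by plugging in the pilot estimate.
    eta = X @ beta0
    score = np.abs(y * np.exp(-eta) - np.exp(eta) / y) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    # Step 2: draw with the approximate probabilities, then fit on the pooled
    # subsample with inverse-probability weights (pilot draws have probability 1/n).
    idx1 = rng.choice(n, size=r, replace=True, p=pi)
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / pi[idx1]])
    return fit_weighted_lpre(X[idx], y[idx], w)


if __name__ == "__main__":
    # Simulated data with log-normal errors (an assumption for the demo only).
    rng = np.random.default_rng(0)
    n = 100_000
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    beta_true = np.array([0.5, -0.3, 0.2])
    y = np.exp(X @ beta_true + rng.normal(scale=0.5, size=n))
    print(two_step_subsample(X, y, r0=500, r=2000, rng=rng))
```

In this sketch only r0 + r rows enter the Newton iterations, while the full data are touched once to compute the sampling probabilities. That single pass is where the computational saving over a full-data fit would come from.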

Journal: Statistica Neerlandica
Publication Date: Mar 21, 2022
Citations: 4

Similar Papers

Optimal subsampling algorithms for composite quantile regression in massive data
Jun Jin ... Tiefeng Ma
Statistics | VOL. 57
04 Jul 2023

Distributed optimal subsampling for quantile regression with massive data
Yue Chao ... Boya Zhu
Journal of Statistical Planning and Inference | VOL. 233
18 Apr 2024

Optimal subsample selection for massive logistic regression with distributed data
Lulu Zuo ... Liuquan Sun
Computational Statistics | VOL. 36
27 Feb 2021

Optimal Subsampling for Large Sample Logistic Regression
Haiying Wang ... Ping Ma
Journal of the American Statistical Association | VOL. 113
03 Apr 2018
