Large-Scale Expectile Regression With Covariates Missing at Random

Yingli Pan, Wen Cai, Zhan Liu

doi:10.1109/access.2020.2970741

Abstract

Analysis of large volumes of data is very complex due to not only a high level of skewness and heteroscedasticity of variance but also the phenomenon of missing data. Expectile regression is a popular alternative method of analyzing heterogeneous data. In this paper, we consider fitting a linear expectile regression model for estimating conditional expectiles based on a large quantity of data with covariates missing at random. We construct a communication-efficient surrogate loss (CSL) function to estimate model parameters. The asymptotic normality of the proposed estimator is established. A proximal alternating direction method of multipliers (ADMM) algorithm is developed for distributed statistical optimization on a large quantity of data. Simulation studies are performed to assess the finite-sample performance of the proposed method. Survey data from the Behavioral Risk Factor Surveillance System (BRFSS) is used to demonstrate the utility of the proposed method in practice.
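For orientation, the standard asymmetric squared-error loss behind expectile regression can be written as below. This is the textbook definition rather than a formula quoted from the paper; the symbols $\tau$, $\beta$, $x_i$, $y_i$ and $n$ are notation assumed here, and the weighting used for covariates missing at random is not shown.

```latex
% Expectile (asymmetric least squares) loss at level \tau \in (0, 1)
\rho_\tau(u) = \bigl|\tau - \mathbb{1}(u < 0)\bigr|\, u^2

% Linear expectile regression on fully observed data
\hat{\beta}_\tau = \arg\min_{\beta} \frac{1}{n} \sum_{i=1}^{n} \rho_\tau\bigl(y_i - x_i^\top \beta\bigr)
```

At $\tau = 0.5$ the loss is proportional to the squared error, so the estimator coincides with ordinary least squares.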

Highlights

Large-scale data, which arise in many fields such as online surveys, genomics and economics, are characterized by a high level of skewness, heteroscedasticity of variance and the phenomenon of missing information
We study a distributed optimization approach to analyzing large-scale data based on expectile regression with covariates missing at random
We study an efficient approach in an expectile regression framework for analyzing large-scale data with covariates missing at random

Summary

INTRODUCTION

Large-scale data, which arise in many fields such as online surveys, genomics and economics, are characterized by a high level of skewness, heteroscedasticity of variance and the phenomenon of missing information. [...] performed [16], [25]. We study a distributed optimization approach to analyzing large-scale data based on expectile regression with covariates missing at random. The communication-efficient surrogate loss (CSL) function can be regarded as a communication-efficient surrogate for the weighted global loss function, and can effectively solve the problems caused by large-scale data stored randomly on multiple machines. To establish the asymptotic properties of the proposed estimator, we apply the distributed optimization theory [10] and the Lindeberg-Feller central limit theorem. Another challenge arises from the numerical calculation of the proposed estimator. The proofs of asymptotic properties are given in the Appendix.
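To make the surrogate-loss idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes fully observed data (the paper's weighting for covariates missing at random is omitted), uses the generic surrogate form "local loss minus a linear gradient correction", and solves it by iteratively reweighted least squares instead of the proximal ADMM algorithm developed in the paper. The names `expectile_weights`, `fit_expectile`, `expectile_gradient` and `csl_step` are hypothetical.

```python
import numpy as np


def expectile_weights(residuals, tau):
    """Asymmetric weights |tau - 1{r < 0}| of the expectile loss."""
    return np.where(residuals < 0, 1.0 - tau, tau)


def fit_expectile(X, y, tau, shift=None, n_iter=100, tol=1e-10):
    """Minimise mean(w_tau(r) * r**2) - shift @ beta with r = y - X @ beta.

    With shift=None this is plain linear expectile regression; a non-zero
    shift adds the linear correction term of a surrogate loss.
    """
    n, p = X.shape
    shift = np.zeros(p) if shift is None else shift
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares start
    for _ in range(n_iter):
        w = expectile_weights(y - X @ beta, tau)
        XtW = X.T * w  # scales column i of X.T (observation i) by w[i]
        # Stationarity: X'WX beta = X'Wy + (n / 2) * shift
        new_beta = np.linalg.solve(XtW @ X, XtW @ y + 0.5 * n * shift)
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta


def expectile_gradient(X, y, beta, tau):
    """Gradient of mean(w_tau(r) * r**2) with respect to beta."""
    r = y - X @ beta
    return -2.0 * X.T @ (expectile_weights(r, tau) * r) / len(y)


def csl_step(machines, beta_bar, tau):
    """One surrogate-loss update carried out on the first machine.

    machines is a list of (X_k, y_k) pairs. Each machine only needs to
    send its local gradient at beta_bar, a vector of length p.
    """
    local_grads = [expectile_gradient(X, y, beta_bar, tau) for X, y in machines]
    sizes = np.array([len(y) for _, y in machines])
    global_grad = np.average(local_grads, axis=0, weights=sizes)
    X1, y1 = machines[0]
    # Surrogate: L_1(beta) - <grad L_1(beta_bar) - grad L_N(beta_bar), beta>
    return fit_expectile(X1, y1, tau, shift=local_grads[0] - global_grad)
```

A typical use would be to compute an initial estimate with `fit_expectile` on the first machine's data, pass it as `beta_bar` to `csl_step`, and repeat the step if further refinement is wanted. Only gradients cross machine boundaries, which is what makes the scheme communication-efficient.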

DESIGN AND ESTIMATION

ASYMPTOTIC PROPERTIES

PROXIMAL ADMM ALGORITHM

SIMULATION STUDIES

DISCUSSION

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 7	License type: CC BY 4.0

Similar Papers

Distributed estimation for large-scale expectile regression
Yingli Pan ... Zhan Liu
Communications in Statistics - Simulation and Computation | VOL. ahead-of-print
07 Aug 2023

Distributed optimization and statistical learning for large-scale penalized expectile regression
Yingli Pan
Journal of the Korean Statistical Society | VOL. 50
09 Jun 2020

High-dimensional expectile regression incorporating graphical structure among predictors
Yingli Pan ... Zhan Liu
Journal of Statistical Computation and Simulation | VOL. ahead-of-print
21 Jul 2022

COPD Surveillance—United States, 1999-2011
Earl S Ford ... Wayne H Giles
Chest | VOL. 144
25 Apr 2013
