Abstract

The divide and conquer strategy, which breaks a massive data set into a series of manageable data blocks and then combines the independent results of the data blocks to obtain a final decision, has been recognized as a state-of-the-art method for overcoming the challenges of massive data analysis. In this paper, we merge the divide and conquer strategy with local average regression methods to infer the regression relationship of input-output pairs from a massive data set. After theoretically analyzing the pros and cons, we find that although divide and conquer local average regression can reach the optimal learning rate, the restriction on the number of data blocks is rather strong, which makes it feasible only for a small number of data blocks. We then propose two variants to lessen (or remove) this restriction. Our results show that these variants can achieve the optimal learning rate under a much milder restriction (or without such a restriction). Extensive experimental studies are carried out to verify our theoretical assertions.

Highlights

  • Divide and conquer strategies are applicable in many massive data analysis scenarios

  • The average mixture has been shown to be efficient and feasible for global modeling methods such as the conditional maximum entropy model [17], kernel ridge regression [29, 15, 4], kernel-based gradient descent [16] and kernel-based spectral algorithms [3, 11]. Compared with these global modeling methods, local average regression (LAR) [12, 8, 25], such as the Nadaraya-Watson kernel (NWK) and k nearest neighbor (KNN) estimates, is by definition a learning scheme that averages the outputs whose corresponding inputs satisfy certain localization assumptions. LAR is recognized in the literature [12] to carry a lower computational burden and is widely used in image processing [24], recommender systems [2] and financial engineering [13]

  • We show that average mixture local average regression (AVM-LAR) can achieve the optimal learning rate of LAR on the whole data set under some strong restrictions on m, the number of data blocks (see the sketch following this list)
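
A minimal Python sketch of the AVM-LAR scheme, assuming a Gaussian kernel for the NWK estimate on each block; the function names nwk_estimate and avm_lar_estimate and the even splitting of samples are illustrative choices, not the paper's notation:

```python
import numpy as np

def nwk_estimate(x, X, Y, h):
    """Nadaraya-Watson kernel estimate at a query point x on a single data block.
    A Gaussian kernel is assumed here; the paper allows any kernel K: X -> R+."""
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2.0 * h ** 2))  # localized weights
    s = w.sum()
    return float(np.dot(w, Y) / s) if s > 0 else 0.0            # weighted average of responses

def avm_lar_estimate(x, X, Y, m, h):
    """AVM-LAR sketch: split the N samples into m blocks, build a LAR estimate on
    each block independently, and combine the m local estimates by averaging."""
    blocks = np.array_split(np.arange(len(X)), m)
    return float(np.mean([nwk_estimate(x, X[idx], Y[idx], h) for idx in blocks]))
```

The paper's analysis concerns how large m can be while this averaged estimate still retains the optimal learning rate of LAR trained on the whole data set.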


Summary

Local average regression

Let D_N = {(X_i, Y_i)}_{i=1}^N be the data set, where X_i ∈ X ⊆ R^d is an explanatory variable and Y_i ∈ [−M, M] is the real-valued response for some 0 < M < ∞. A LAR estimate averages the responses Y_i whose inputs X_i lie close to the query point x, and its localization parameter may depend on the data and on the query point x. Two widely used examples of LAR are the Nadaraya-Watson kernel (NWK) and k nearest neighbor (KNN) estimates. (NWK estimate) Let K : X → R_+ be a kernel function [12] and h > 0 be its localization parameter; in the NWK estimate, the localization parameter depends only on the size of the data. (KNN estimate) We denote the weight of KNN as W_{h,X_i} instead of W_{k,X_i} for the sake of unity, where h = ‖x − X_(k)(x)‖ depends on the distribution of the data and on the query point x
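
A minimal Python sketch of the KNN estimate with the data-dependent localization parameter described above; the name knn_estimate and the use of Euclidean distance are illustrative assumptions:

```python
import numpy as np

def knn_estimate(x, X, Y, k):
    """k nearest neighbor estimate at a query point x.
    The implicit localization parameter h = ||x - X_(k)(x)|| is the distance to the
    k-th nearest neighbor, so it depends on the data and on the query point x."""
    dist = np.linalg.norm(X - x, axis=1)   # Euclidean distances to all inputs
    nearest = np.argsort(dist)[:k]         # indices of the k closest inputs
    return float(Y[nearest].mean())        # average of their responses
```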

Optimal learning rate of LAR
AVM-LAR
AVM-LAR with data-dependent parameters
Qualified AVM-LAR
Simulation 1
Simulation 2
Proofs
Conclusion