Abstract

In this paper, based on the Karush–Kuhn–Tucker (KKT) conditions of the ℓ0-regularized problem, we propose a communication-efficient distributed learning approach for high-dimensional and sparse generalized linear models with massive data sets stored across different machines. The proposed method is a distributed form of support detection and root finding for generalized linear models. In each round, the support set is first determined from the primal and dual information aggregated on the master machine; the maximum likelihood estimator restricted to this support is then obtained by gradient descent, for which each machine only needs to compute its local gradient vector and communicate it, rather than the raw data. We establish an ℓ∞-norm error bound for the sequence generated by the proposed algorithm and show that this error decays exponentially to the optimal order. Moreover, we show that the oracle estimator can be recovered when the target signal is no smaller than the detectable level. In addition, an adaptive version of the proposed algorithm is developed to estimate the sparsity level. Simulation studies illustrate the superior performance of the proposed methods.
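To make the round structure concrete, the following is a minimal sketch of one round of the support detection and root finding idea for a logistic model, assuming a simple gradient-averaging protocol between workers and master. The function names (`local_gradient`, `sdar_round`) and the parameters `T`, `step`, and `inner_iters` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def local_gradient(X, y, beta):
    """Gradient of the local negative log-likelihood (logistic model)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (p - y) / X.shape[0]

def sdar_round(machines, beta, T, step=0.1, inner_iters=50):
    """One communication round: detect the support, then refit on it.

    `machines` is a list of (X, y) data shards; only gradient vectors
    are averaged across shards, never the raw data.
    """
    n_machines = len(machines)
    # Master aggregates the dual information (negative averaged gradient).
    grad = sum(local_gradient(X, y, beta) for X, y in machines) / n_machines
    dual = -grad
    # Support detection: keep the T coordinates with the largest combined
    # primal + dual magnitude (a KKT-based selection rule).
    support = np.argsort(-np.abs(beta + dual))[:T]
    # Root finding: gradient descent for the MLE restricted to the support,
    # again communicating only gradient vectors in each inner iteration.
    new_beta = np.zeros_like(beta)
    for _ in range(inner_iters):
        grad = sum(local_gradient(X, y, new_beta) for X, y in machines) / n_machines
        new_beta[support] -= step * grad[support]
    return new_beta, support
```

In this sketch only length-p gradient vectors cross the network in each inner iteration, which is the source of the communication savings described above.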
