Abstract

With the growth of data and necessity for distributed optimization methods, solvers that work well on a single machine must be re-designed to leverage distributed computation. Recent work in this area has been limited by focusing heavily on developing highly specific methods for the distributed environment. These special-purpose methods are often unable to fully leverage the competitive performance of their well-tuned and customized single machine counterparts. Further, they are unable to easily integrate improvements that continue to be made to single machine methods. To this end, we present a framework for distributed optimization that both allows the flexibility of arbitrary solvers to be used on each (single) machine locally and yet maintains competitive performance against other state-of-the-art special-purpose distributed methods. We give strong primal–dual convergence rate guarantees for our framework that hold for arbitrary local solvers. We demonstrate the impact of local solver selection both theoretically and in an extensive experimental comparison. Finally, we provide thorough implementation details for our framework, highlighting areas for practical performance gains.

Highlights

  • Background and problem formulation: To provide context for our framework, we first state traditional complexity measures and convergence rates for single machine algorithms, and demonstrate that these must be adapted to more accurately represent the performance of an algorithm in the distributed setting. When running an iterative optimization algorithm A on a single machine, its performance is typically measured by the total runtime: TIME(A) = I_A(ε) × T_A. (T-A) Here, T_A stands for the time it takes to perform a single iteration of algorithm A, and I_A(ε) is the number of iterations A needs to attain an ε-accurate objective. On a single machine, most state-of-the-art first-order optimization methods achieve fast convergence in practice in terms of (T-A) by performing a large number of relatively fast iterations

  • We review a number of methods designed to solve optimization problems of the form of interest here, which are typically referred to as regularized empirical risk minimization (ERM) problems in the machine learning literature

  • The known convergence rates for the alternating direction method of multipliers (ADMM) are weaker than those of the more problem-tailored methods we study here, and the choice of the penalty parameter is often unclear in practice

Summary

Introduction

To provide context for our framework, we first state traditional complexity measures and convergence rates for single machine algorithms, and demonstrate that these must be adapted to more accurately represent the performance of an algorithm in the distributed setting. When running an iterative optimization algorithm A on a single machine, its performance is typically measured by the total runtime: TIME(A) = I_A(ε) × T_A, where T_A is the time per iteration and I_A(ε) is the number of iterations A needs to attain an ε-accurate objective. Most state-of-the-art first-order optimization methods achieve fast convergence in practice in terms of this measure by performing a large number of relatively fast iterations. In the distributed setting, however, the time to communicate between two machines can be several orders of magnitude slower than even a single iteration of such an algorithm. Distributed timing can be more accurately illustrated using the following practical distributed efficiency model (see [37]): TIME(A) = I_A(ε) × (c + T_A), where c is the time required for a round of communication.
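The trade-off implied by the two runtime models above can be made concrete with a small numerical sketch. The functions below encode the single-machine model TIME(A) = I_A(ε) × T_A and the distributed model in which every round additionally pays a communication cost c. All numeric values are hypothetical, chosen only to illustrate why a method with many cheap iterations, which is attractive on a single machine, can be dominated by communication in the distributed setting.

```python
def single_machine_time(iterations: int, time_per_iter: float) -> float:
    """Single-machine model: TIME(A) = I_A(eps) * T_A."""
    return iterations * time_per_iter


def distributed_time(rounds: int, time_per_round: float, comm_cost: float) -> float:
    """Distributed model: every round pays both local work and communication,
    i.e. TIME(A) = I_A(eps) * (c + T_A)."""
    return rounds * (comm_cost + time_per_round)


# Hypothetical first-order method: many cheap iterations (1 ms each),
# but each distributed round also pays 100 ms of communication.
fast_iters = distributed_time(rounds=10_000, time_per_round=0.001, comm_cost=0.1)

# Hypothetical method doing more local work per round: fewer, costlier rounds.
more_local_work = distributed_time(rounds=100, time_per_round=0.05, comm_cost=0.1)

# With c >> T_A, the cheap-iteration method is dominated by communication,
# even though it would win on a single machine.
print(fast_iters, more_local_work)
```

This is only a back-of-the-envelope sketch of the efficiency model, not part of the framework itself; it shows why shifting work into the local solver (raising T_A to lower I_A(ε)) pays off once the communication cost c dominates.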
