Abstract

Many problems in machine learning can be written as the minimization of a sum of individual loss functions over the training examples. These functions are usually differentiable but, in some cases, their gradients are not Lipschitz continuous, which compromises the use of (proximal) gradient algorithms. Fortunately, changing the geometry and using Bregman divergences can alleviate this issue in several applications, such as Poisson linear inverse problems. However, the Bregman operation makes the aggregation of several points and gradients more involved, hindering the distribution of computations for such problems. In this paper, we propose an asynchronous variant of the Bregman proximal-gradient method that adapts to any centralized computing system. In particular, we prove that the algorithm copes with arbitrarily long delays, and we illustrate its behaviour on distributed Poisson inverse problems.
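For context, the following is a minimal sketch of a single synchronous Bregman gradient step on a Poisson linear inverse problem, using the standard Burg entropy kernel; it is not the paper's asynchronous algorithm, and the matrix A, observations b, step-size rule, and function names are illustrative assumptions made for the example.

```python
import numpy as np

def poisson_loss(x, A, b):
    """Poisson (KL-type) data-fit term: sum_i (a_i^T x - b_i * log(a_i^T x))."""
    Ax = A @ x
    return np.sum(Ax - b * np.log(Ax))

def bregman_step(x, A, b, gamma):
    """One Bregman gradient step with the Burg entropy kernel h(x) = -sum_j log x_j."""
    Ax = A @ x                      # forward model, assumed positive
    grad = A.T @ (1.0 - b / Ax)     # gradient of the Poisson loss
    # With the Burg entropy kernel, the step solves 1/x_new = 1/x + gamma*grad
    # componentwise, so the iterate stays positive for a small enough step size.
    return x / (1.0 + gamma * x * grad)

# Tiny synthetic example (illustrative data, not from the paper).
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(20, 5))
x_true = rng.uniform(0.5, 2.0, size=5)
b = A @ x_true
gamma = 1.0 / b.sum()   # step size tied to relative smoothness w.r.t. the Burg entropy
x = np.ones(5)
for _ in range(500):
    x = bregman_step(x, A, b, gamma)
print(poisson_loss(np.ones(5), A, b), "->", poisson_loss(x, A, b))
```

Note that no Lipschitz constant of the gradient is needed: the step size is chosen relative to the Bregman geometry, which is precisely what makes this family of methods attractive for Poisson-type losses.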
