Abstract

Gradient-based algorithms play an important role in solving a wide range of stochastic optimization problems. In recent years, implementing such schemes in parallel has become standard practice. In this work, we focus on the asynchronous implementation of gradient-based algorithms. In asynchronous distributed optimization, the gradient delay problem arises because optimization parameters may be updated using stale gradients. We consider a hub-and-spoke system and derive the expected gradient staleness in terms of other system parameters, such as the number of nodes, the communication delay, and the expected compute time. Our derivations provide a means to compare different algorithms based on the expected gradient staleness they suffer from.
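The staleness notion described in the abstract can be made concrete with a small event-driven simulation. The sketch below is not from the paper: the function name, the exponential compute-time model, the fixed communication delay, and the definition of staleness as the gap in server-side version counters are all illustrative assumptions. It simulates a hub-and-spoke system in which each worker reads the current model, computes a gradient, and sends it to the server, which applies it asynchronously.

```python
import heapq
import random

def simulate_staleness(num_workers=8, comm_delay=0.1, mean_compute=1.0,
                       num_updates=20000, seed=0):
    """Event-driven simulation of asynchronous SGD on a hub-and-spoke system.

    Each worker reads the current model version, computes for an
    exponentially distributed time, pays a fixed communication delay,
    and the server then applies the update. The staleness of an update
    is (server model version at apply time) - (version the worker read).
    """
    rng = random.Random(seed)
    version = 0          # server-side model version counter
    events = []          # min-heap of (finish_time, worker_id, version_read)

    # All workers start by reading version 0.
    for w in range(num_workers):
        t = rng.expovariate(1.0 / mean_compute) + comm_delay
        heapq.heappush(events, (t, w, 0))

    staleness_sum = 0
    for _ in range(num_updates):
        t, w, v_read = heapq.heappop(events)
        staleness_sum += version - v_read  # updates applied since this read
        version += 1                       # server applies the gradient
        # The worker immediately reads the new model and starts a new round.
        t_next = t + rng.expovariate(1.0 / mean_compute) + comm_delay
        heapq.heappush(events, (t_next, w, version))

    return staleness_sum / num_updates

if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"workers={n:2d}  avg staleness ~ {simulate_staleness(num_workers=n):.2f}")
```

Under these assumptions the simulation illustrates a simple symmetry argument: with statistically identical workers, each update sees on average one intervening update from each of the other workers, so the average staleness concentrates near n - 1 and grows with the number of nodes, consistent with the kind of dependence on system parameters the abstract describes.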
