Abstract

Distributed second-order optimization, as an effective strategy for training large-scale machine learning systems, has been widely investigated due to its low communication complexity. However, the existing distributed second-order optimization algorithms, including distributed approximate Newton (DANE), accelerated inexact DANE (AIDE), and statistically preconditioned accelerated gradient (SPAG), are all required to precisely solve an expensive subproblem up to the target precision. Therefore, this causes these algorithms to suffer from high computation costs and this hinders their development. In this article, we design a novel distributed second-order algorithm called the accelerated distributed approximate Newton (ADAN) method to overcome the high computation costs of the existing ones. Compared with DANE, AIDE, and SPAG, which are constructed based on the relative smooth theory, ADAN's theoretical foundation is built upon the inexact Newton theory. The different theoretical foundations lead to handle the expensive subproblem efficiently, and steps required to solve the subproblem are independent of the target precision. At the same time, ADAN resorts to the acceleration and can effectively exploit the objective function's curvature information, making ADAN to achieve a low communication complexity. Thus, ADAN can achieve both the communication and computation efficiencies, while DANE, AIDE, and SPAG can achieve only the communication efficiency. Our empirical study also validates the advantages of ADAN over extant distributed second-order algorithms.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.