Abstract

In this paper, we present a limited-memory common-directions method for smooth optimization that interpolates between first- and second-order methods. At each iteration, a subspace of a limited dimension size is constructed using first-order information from previous iterations, and an efficient Newton method is deployed to find an approximate minimizer within this subspace. With properly selected subspace of dimension as small as two, the proposed algorithm achieves the optimal convergence rates for first-order methods while remaining a descent method, and it also possesses fast convergence speed on nonconvex problems. Since the major operations of our method are dense matrix-matrix operations, the proposed method can be efficiently parallelized in multicore environments even for sparse problems. By wisely utilizing historical information, our method is also communication-efficient in distributed optimization that uses multiple machines as the Newton steps can be calculated with little communication. Numerical study shows that our method has superior empirical performance on real-world large-scale machine learning problems.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.