Abstract

Deep learning applications require global optimization of non-convex objective functions, which have multiple local minima. The same problem is often found in physical simulations and may be resolved by the methods of Langevin dynamics with Simulated Annealing, a well-established approach for the minimization of many-particle potentials. This analogy provides useful insights for non-convex stochastic optimization in machine learning. Here we find that integration of the discretized Langevin equation gives a coordinate-updating rule equivalent to the well-known Momentum optimization algorithm. As the main result, we show that gradually decreasing the momentum coefficient from an initial value close to unity down to zero is equivalent to applying Simulated Annealing, or slow cooling in physical terms. Building on this observation, we propose CoolMomentum, a new stochastic optimization method. Applying CoolMomentum to the optimization of ResNet-20 on the CIFAR-10 dataset and EfficientNet-B0 on ImageNet, we demonstrate that it achieves high accuracies.
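The update rule described in the abstract can be sketched in a few lines: a standard Momentum step whose momentum coefficient is "cooled" from a value near unity toward zero over the course of training. This is a minimal illustration under an assumed linear cooling schedule (the paper derives its schedule from the Langevin-dynamics analogy); the function name and hyperparameters are placeholders, not the authors' reference implementation:

```python
import numpy as np

def coolmomentum_sketch(grad, theta0, lr=0.1, rho0=0.99, steps=500):
    """Momentum optimization with an annealed ('cooled') momentum coefficient.

    grad   : function returning the gradient of the objective at theta
    theta0 : initial parameter vector
    rho0   : initial momentum coefficient, close to unity

    The linear decay of rho below is an illustrative assumption.
    """
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    for t in range(steps):
        rho = rho0 * (1.0 - t / steps)   # cool momentum from rho0 toward 0
        v = rho * v - lr * grad(theta)   # velocity (momentum) update
        theta = theta + v                # coordinate update
    return theta

# Toy example: minimize the quadratic bowl f(x) = ||x||^2 / 2, grad f = x.
x_min = coolmomentum_sketch(lambda x: x, [3.0, -2.0])
```

Early in training the large momentum coefficient lets the iterate traverse barriers between local minima (the "hot" phase); as the coefficient decays, the dynamics become increasingly damped and the iterate settles into a minimum.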

Highlights

  • Deep learning applications require global optimization of non-convex objective functions, which have multiple local minima

  • It is shown that several optimization algorithms, e.g. stochastic gradient descent (SGD) with momentum [3], Adagrad [4], RMSProp [5], Adadelta [6] and Adam [7], are efficient for training artificial neural networks and optimization of non-convex objective functions [8,9]. In the non-convex setting, the objective function has multiple local minima and the efficient algorithms rely on the "hill climbing" heuristics

  • We propose to adapt the methods of Langevin dynamics to the problems of nonconvex optimization, that appear in machine learning

Introduction

Deep learning applications require global optimization of non-convex objective functions, which have multiple local minima. Training a machine learning model amounts to finding the parameter values that optimize an objective function.

