Abstract

This work proposes a variant of the Sequential Motion Optimization (SMO) framework, called Sequential Motion Optimization with Short-term Adaptive Moment Estimation (SMO-SAdam), for optimizing neural network parameters. SMO-SAdam embeds the mechanism of Adaptive Moment Estimation (Adam) into SMO to enhance its effectiveness on large-scale problems. In particular, the first and second moments of the solution candidates, including the leader and the follower, are calculated by a short-term adaptive moment estimation scheme and incorporated into the SMO process to accelerate its parameter updates. The surrogate gradients required at each update step are derived analytically by applying the chain rule through the SMO motion chain. Numerical results on both standard and scientific deep learning (DL) benchmarks, covering training and generalization performance, confirm the superiority of the proposed SMO-SAdam over state-of-the-art algorithms. Notably, SMO-SAdam performs strongly on scientific deep learning problems with highly complex loss landscapes, indicating its potential for other applications in this field as well as for non-convex optimization problems more broadly.
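To make the described mechanism concrete, the following is a minimal Python sketch of an Adam-style update driven by a surrogate gradient, with the moment estimates reset over a short window to mimic "short-term" estimation. The windowed reset, hyper-parameter values, and function names here are illustrative assumptions, not the paper's exact formulation; in SMO-SAdam the surrogate gradient is obtained via the chain rule through the SMO motion chain, whereas here it is simply taken as an input.

```python
import numpy as np

def short_term_adam_step(theta, surrogate_grad, state, lr=1e-3,
                         beta1=0.9, beta2=0.999, eps=1e-8, window=10):
    """One Adam-style update using moments estimated over a short horizon.

    Hypothetical sketch: `state` holds the first/second moment estimates
    `m`, `v` and a step counter `t`. Moments are reset every `window`
    steps so they reflect only recent (short-term) surrogate gradients
    of a candidate such as the leader or the follower.
    """
    if state["t"] % window == 0:  # periodic reset -> short-term estimation
        state["m"] = np.zeros_like(theta)
        state["v"] = np.zeros_like(theta)
        state["t"] = 0

    state["t"] += 1
    t = state["t"]
    # Standard Adam exponential moving averages of the surrogate gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * surrogate_grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * surrogate_grad ** 2
    m_hat = state["m"] / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** t)  # bias-corrected second moment
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

# Usage: initialize the state once, then call per update step.
theta = np.zeros(4)
state = {"t": 0}
theta = short_term_adam_step(theta, surrogate_grad=np.ones(4), state=state)
```

The periodic reset is one simple way to keep the moment estimates "short-term"; a sliding-window average would be an equally plausible reading of the abstract.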
