Abstract
Deep neural networks (DNNs) are typically optimized using stochastic gradient descent (SGD). However, gradients estimated from stochastic samples tend to be noisy and unreliable, resulting in large gradient variance and poor convergence. In this paper, we propose the Kalman Optimizer (KO), an efficient stochastic optimization algorithm that uses a Kalman filter to produce a consistent estimate of the local gradient by solving an adaptive filtering problem. Our method reduces the estimation variance in stochastic gradient descent by incorporating the historical state of the optimization. It aims to correct noisy gradient directions and to accelerate the convergence of learning. We demonstrate the effectiveness of the proposed Kalman Optimizer on a variety of optimization tasks, where it achieves superior and robust performance. The code is available at https://github.com/Adamdad/Filter-Gradient-Decent.
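To make the idea of filtering a noisy gradient concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar gradient that follows a random-walk model and is observed with Gaussian noise. The class name `ScalarKalmanGradientFilter`, the helper `noisy_quadratic_descent`, and the parameters `process_var` and `obs_var` are hypothetical names introduced here for illustration.

```python
import random


class ScalarKalmanGradientFilter:
    """Scalar Kalman filter over a noisy gradient (illustrative assumption).

    Assumed model: g_t = g_{t-1} + w_t,  w_t ~ N(0, process_var)
                   y_t = g_t + v_t,      v_t ~ N(0, obs_var)
    """

    def __init__(self, process_var=1e-3, obs_var=1.0):
        self.process_var = process_var
        self.obs_var = obs_var
        self.mean = 0.0  # current estimate of the gradient
        self.var = 1.0   # uncertainty of that estimate

    def update(self, noisy_grad):
        # Predict: the random-walk model keeps the mean and inflates the variance.
        var_pred = self.var + self.process_var
        # Correct: blend the prediction with the new noisy observation.
        gain = var_pred / (var_pred + self.obs_var)
        self.mean += gain * (noisy_grad - self.mean)
        self.var = (1.0 - gain) * var_pred
        return self.mean


def noisy_quadratic_descent(use_filter, steps=500, lr=0.1, noise_std=1.0, seed=0):
    """Minimize f(x) = 0.5 * x**2 using gradients corrupted by Gaussian noise."""
    rng = random.Random(seed)
    kf = ScalarKalmanGradientFilter(process_var=1e-2, obs_var=noise_std ** 2)
    x = 5.0
    for _ in range(steps):
        noisy_grad = x + rng.gauss(0.0, noise_std)  # true gradient is x
        step = kf.update(noisy_grad) if use_filter else noisy_grad
        x -= lr * step
    return x
```

In this sketch the filter maintains a running estimate of the gradient and weights each new observation by the Kalman gain, so the step direction depends on the optimization history and not on a single noisy sample. A practical optimizer for a DNN would need a per-parameter or vector-valued state and a way to set the noise variances, which this sketch does not address.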