Abstract

First-order gradient-based optimization algorithms are of core practical importance in deep learning. In this paper, we propose NWM-Adam, a new weighting-mechanism-based first-order gradient descent optimization algorithm, to resolve the undesirable convergence behavior of optimization algorithms that scale the gradient updates using a fixed-size window of past gradients, and to improve the performance of Adam and AMSGrad. NWM-Adam is built on the idea of placing more memory on past gradients than on recent gradients, and it can easily adjust the degree to which the past gradients weigh in the estimation. To empirically evaluate NWM-Adam, we compare it with other popular optimization algorithms on three well-known machine learning models: logistic regression, multi-layer fully connected neural networks, and deep convolutional neural networks. The experimental results show that NWM-Adam outperforms the other optimization algorithms.
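The abstract does not give the exact NWM-Adam update rule, so the following is only a minimal Python sketch of the general idea it describes: an Adam-style optimizer whose second-moment estimate gives older squared gradients more weight than recent ones, with a tunable degree of past emphasis. The class name PastWeightedAdam, the hyperparameter gamma, and the k^(-gamma) weighting scheme are illustrative assumptions, not the paper's definition of NWM-Adam.

```python
import numpy as np

class PastWeightedAdam:
    """Illustrative Adam-style optimizer whose second-moment estimate
    weights older gradients more heavily than recent ones.

    This is a hypothetical sketch of the idea sketched in the abstract,
    not the paper's actual NWM-Adam update rule.
    """

    def __init__(self, lr=1e-3, beta1=0.9, gamma=0.1, eps=1e-8):
        self.lr = lr          # step size
        self.beta1 = beta1    # usual first-moment decay rate
        self.gamma = gamma    # > 0: larger values emphasize past gradients more
        self.eps = eps
        self.m = None         # first-moment estimate
        self.v = None         # past-weighted second-moment estimate
        self.b_sum = 0.0      # running sum of the weights b_k
        self.t = 0            # step counter

    def step(self, params, grad):
        if self.m is None:
            self.m = np.zeros_like(params)
            self.v = np.zeros_like(params)
        self.t += 1

        # First moment: standard exponential moving average of gradients.
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad

        # Second moment: weight b_k = k^(-gamma) shrinks as the step index
        # grows, so earlier squared gradients keep a larger share of the
        # weighted average. gamma controls how strongly the past dominates.
        b_k = self.t ** (-self.gamma)
        new_sum = self.b_sum + b_k
        self.v = (self.b_sum / new_sum) * self.v + (b_k / new_sum) * grad ** 2
        self.b_sum = new_sum

        # Parameter update, as in Adam but with the past-weighted v.
        return params - self.lr * self.m / (np.sqrt(self.v) + self.eps)
```

With gamma = 0 all squared gradients are weighted equally (an AdaGrad-like average), while larger gamma values tilt the average further toward early gradients, which is one simple way to realize an adjustable "more memory of the past" weighting.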
