Abstract

Artificial Neural Networks (ANNs) are fine-tuned to yield the best performance through an iterative process in which the values of their parameters are altered. Optimization is the preferred method for determining the parameters that minimize the loss function, an evaluation metric for ANNs. However, the process of finding an optimal model with minimum loss faces several obstacles, most notably the efficiency and rate of convergence to the minima of the loss function. Such optimization efficiency is imperative for reducing the computational resources and time consumed when training neural network models. This paper reviews and compares the intuition and effectiveness of existing optimization algorithms such as Gradient Descent, Gradient Descent with Momentum, RMSProp, and Adam, which use first-order derivatives, and Newton's Method, which uses second-order derivatives for convergence. It also explores the possibility of combining first- and second-order optimization techniques for improved performance when training Artificial Neural Networks.
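The update rules behind the optimizers named above can be sketched compactly. The following is a minimal illustration, not the paper's implementation: each optimizer is applied to the one-dimensional quadratic loss f(w) = 0.5·w², whose gradient is w and whose Hessian is 1, so Newton's Method reaches the minimum in a single step while the first-order methods converge iteratively. All hyperparameter values (learning rate, decay factors) are illustrative assumptions.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * w**2.
    return w

def gradient_descent(w, lr=0.1, steps=100):
    # Plain first-order update: step against the gradient.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def momentum(w, lr=0.1, beta=0.9, steps=100):
    # Exponential moving average of gradients smooths the trajectory.
    v = 0.0
    for _ in range(steps):
        v = beta * v + (1 - beta) * grad(w)
        w -= lr * v
    return w

def rmsprop(w, lr=0.1, beta=0.9, eps=1e-8, steps=100):
    # Scale each step by a running average of squared gradients.
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = beta * s + (1 - beta) * g * g
        w -= lr * g / (math.sqrt(s) + eps)
    return w

def adam(w, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=100):
    # Combines momentum and RMSProp, with bias correction for the
    # zero-initialized moment estimates.
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

def newton(w, steps=1):
    # Second-order update: divide the gradient by the Hessian (here 1),
    # so one step lands exactly on the minimum of this quadratic.
    for _ in range(steps):
        w -= grad(w) / 1.0
    return w
```

On this quadratic, starting from w = 5.0, every first-order method drives w toward 0 over its iterations, while `newton(5.0)` returns 0.0 after one step, illustrating the convergence-rate contrast between first- and second-order methods that the paper examines.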
