Abstract
The learning process of machine learning consists of finding the values of unknown weights in a cost function by minimizing that cost function over the learning data. However, since the cost function is not convex, it is difficult to find its minimum value. Existing methods for finding the minimum usually rely on the first derivative of the cost function. When a local minimum (but not the global minimum) is reached, the first derivative of the cost function becomes zero, so these methods return the local minimum value and the desired global minimum cannot be found. To overcome this problem, in this paper we modify one of the existing schemes—the adaptive momentum estimation (Adam) scheme—by adding a new term that prevents the new optimizer from remaining at a local minimum. The convergence condition and convergence value of the proposed scheme are also analyzed, and further illustrated through several numerical experiments whose cost functions are non-convex.
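To illustrate the difficulty described above, consider a minimal sketch (not taken from the paper; the cost function, starting point, and step size are chosen here only for illustration) in which plain gradient descent stalls at a local minimum of a one-dimensional non-convex cost, because the first derivative vanishes there:

def f(x):
    return x**4 - 3 * x**2 + x      # non-convex: local minimum near x ≈ 1.13, global minimum near x ≈ -1.30

def grad_f(x):
    return 4 * x**3 - 6 * x + 1     # first derivative of the cost

x = 2.0                              # initial guess lies in the basin of the local minimum
lr = 0.01                            # learning rate (step size)
for _ in range(2000):
    x -= lr * grad_f(x)              # gradient descent step; stops moving once grad_f(x) ≈ 0

print(x, f(x))                       # ends near the local minimum x ≈ 1.13, not the global minimum x ≈ -1.30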
Highlights
Deep learning is a part of a broader family of machine learning methods [1–10] based on learning data representations, as opposed to task-specific algorithms
We introduce an enhanced optimization scheme, based on the popular adaptive momentum estimation (Adam) scheme, for non-convex problems arising from the machine learning process
Most existing optimizers may become stuck at a local minimum of a non-convex problem when they reach it before reaching a global minimum
Summary
Deep learning is a part of a broader family of machine learning methods [1–10] based on learning data representations, as opposed to task-specific algorithms. A machine finds appropriate weight values from the data by introducing a cost function. There are several optimization schemes [11–25] that can be used to find the weights by minimizing the cost function, such as the gradient descent (GD) method [26]. The adaptive momentum estimation (Adam) scheme [27,28] is the most popular scheme based on GD. Adam computes individual adaptive learning rates for different parameters from estimates of the first and second moments of the gradients. The Adam method has been widely used; it is easy to implement, computationally efficient, and works quite well in most cases.
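For reference, a minimal sketch of the standard Adam update described above, built from estimates of the first and second moments of the gradients (hyperparameter values follow the common defaults; the function and variable names are ours, not the paper's):

import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # exponential moving averages of the gradient (first moment)
    # and of the squared gradient (second moment)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    # bias correction for the zero initialization of m and v (t starts at 1)
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    # per-parameter adaptive step
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

The proposed scheme adds an extra term to this update so that the iterates do not remain at a local minimum; its exact form and convergence analysis are given in the paper.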