Abstract

The process of machine learning is to find the parameters that minimize a cost function constructed from the training data. This is called optimization, and the resulting parameters are called the optimal parameters of the neural network. In the search for the optimum, there have been attempts to solve the optimization symmetrically or to initialize the parameters symmetrically. Furthermore, to obtain the optimal parameters, existing methods decrease the learning rate over the iteration time or change it according to a fixed ratio; these schedules decrease monotonically at a constant rate as a function of the iteration time. Our idea is to make the learning rate changeable, unlike the monotonically decreasing schedules. We introduce a method that finds the optimal parameters by adaptively changing the learning rate according to the value of the cost function. When the cost function is minimized, learning is complete and the optimal parameters are obtained. This paper proves that the method converges to the optimal parameters, which means that it reaches a minimum of the cost function (effective learning). Numerical experiments demonstrate that learning is effective in various situations when the proposed learning rate schedule is used.
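As a rough, hypothetical sketch of the idea described in the abstract (not the exact rule analyzed in the paper), the snippet below updates parameters with a learning rate that depends on the current value of the cost function rather than on the iteration count. The quadratic cost and the specific scaling c / (1 + c) are illustrative assumptions only.

```python
# Illustrative sketch: a learning rate that adapts to the current cost value,
# in contrast to schedules that decay with the iteration count.
# The scaling rule below is an assumption for demonstration, not the paper's rule.
import numpy as np

def cost(w):
    # Simple quadratic cost with minimum at w = (1, -2).
    return 0.5 * ((w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2)

def grad(w):
    # Gradient of the quadratic cost above.
    return np.array([w[0] - 1.0, w[1] + 2.0])

w = np.array([5.0, 5.0])
base_lr = 0.1
for t in range(200):
    c = cost(w)
    lr = base_lr * c / (1.0 + c)  # hypothetical rule: lr shrinks as the cost approaches its minimum
    w = w - lr * grad(w)

print(w, cost(w))
```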

Highlights

  • Machine learning is carried out by using a cost function to measure how accurately a model learns from the data and by determining the parameters that minimize this cost function

  • This paper proves its convergence when used with the Adam method

  • To solve this problem, existing methods used a learning rate that decreases monotonically at a constant rate according to the iteration time


Summary

Introduction

Machine learning is carried out by using a cost function to measure how accurately a model learns from the data and by determining the parameters that minimize this cost function. With larger sets of training data and more complex training models, the cost function may have many local minima, and the simple gradient descent method fails at a local minimum because the gradient vanishes at this point. To solve this problem, gradient-based methods in which learning continues even when the gradient is zero have been introduced, such as momentum-based methods. Since the learning rate set initially is a constant, the gradient may not be scaled appropriately during learning. To address this, other methods have been developed to schedule the learning rate, such as step-based and time-based methods, in which the learning rate is not a constant but a function that becomes smaller as learning progresses [11,12,13,14]. The numerical tests include Weber's function experiments, which test behavior in multidimensional spaces with local minima, binary classification experiments, and classification experiments with several classes [28]. Common forms of these iteration-dependent schedules are sketched below.
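A minimal sketch of the standard iteration-dependent schedules mentioned above (time-based, step-based, and exponential decay); the decay constants are chosen only for illustration and are not values from the paper.

```python
# Standard monotonically decreasing learning-rate schedules as functions of the
# iteration count t. Decay constants are illustrative choices.
import math

def time_based(lr0, t, decay=0.01):
    # Learning rate decreases as 1 / (1 + decay * t).
    return lr0 / (1.0 + decay * t)

def step_based(lr0, t, drop=0.5, step_size=10):
    # Learning rate is cut by the factor 'drop' every 'step_size' iterations.
    return lr0 * drop ** math.floor(t / step_size)

def exponential(lr0, t, k=0.05):
    # Learning rate decays exponentially with the iteration count.
    return lr0 * math.exp(-k * t)

for t in (0, 10, 50):
    print(t, time_based(0.1, t), step_based(0.1, t), exponential(0.1, t))
```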

Machine Learning Method
Direction Method
Gradient Descent Method
Momentum Method
Learning Rate Schedule
Time-Based Learning Rate Schedule
Step-Based Learning Rate Schedule
Exponential-Based Learning Rate Schedule
Adaptive Optimization Methods
The Proposed Method
Numerical Tests
Two-Variable Function Test Using Weber’s Function
Case 1
Case 2
MNIST with MLP
Conclusions
