Abstract

The learning rate is among the most critical hyper-parameters of a neural network and has a significant impact on its performance. This article presents a novel learning rate scheme, termed randomness distribution learning rate (RDLR), to regulate the learning rate value. RDLR shifts the learning rate from a deterministic value to a random variable and sets its value according to the state of the network. To estimate the redundancy of the network, RDLR uses the distances between neurons rather than a covariance matrix, together with a Monte Carlo method, and simplifies each neuron to a point to reduce computational cost. The proposed algorithms do not set the learning rate of each individual epoch; instead, they regulate the mathematical expectation and distribution of the learning rate over the whole training process. With these algorithms, the network can escape local minima or unstable regions and reach the minimum of a region in gradient space. The RDLR algorithms reduce the sensitivity to small changes in the learning rate value and streamline the tuning process of neural networks. RDLR saves computational cost and can work independently or in combination with traditional learning rate algorithms. Used together with a traditional schedule, RDLR can apply the same learning rate strategy to all layers of a network, or keep the mathematical expectation of the learning rate of each layer unchanged while adjusting its impulse. Experiments show that RDLR can improve the performance of a neural network while keeping the other hyper-parameters unchanged. It is a novel method for adjusting the training process by dynamically changing the random distribution of the learning rate. The algorithm monitors the state of the neural network and keeps injecting randomness into training according to the redundancy of the neurons, and it requires no additional hyper-parameters. Experiments show that RDLR improves the performance of neural networks of multiple architectures on various tasks, and that it works with a variety of loss functions and data augmentation methods.
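
As a rough illustration only, the sketch below captures the general idea described in the abstract: each neuron is simplified to a point (its weight vector), redundancy is estimated from a Monte Carlo sample of pairwise distances, and the learning rate used at each step is drawn at random from a distribution whose expectation depends on that redundancy estimate. The function names, the choice of distribution, and the scaling rule are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def estimate_redundancy(weight_matrix, n_pairs=256, rng=None):
    """Estimate layer redundancy from pairwise distances between neurons.

    Each neuron is reduced to a point (its weight vector); a Monte Carlo
    sample of neuron pairs keeps the cost low. A small average distance is
    read as high redundancy. (Illustrative assumption, not the paper's formula.)
    """
    rng = rng or np.random.default_rng()
    n_neurons = weight_matrix.shape[0]
    i = rng.integers(0, n_neurons, size=n_pairs)
    j = rng.integers(0, n_neurons, size=n_pairs)
    dists = np.linalg.norm(weight_matrix[i] - weight_matrix[j], axis=1)
    # Map mean distance to a redundancy score in (0, 1]: closer neurons -> more redundant.
    return 1.0 / (1.0 + dists.mean())

def sample_learning_rate(base_lr, redundancy, rng=None):
    """Draw a random learning rate whose expectation depends on redundancy.

    The expectation is scaled up when redundancy is high (helping the network
    jump out of local minima or unstable regions) and the value is sampled from
    an exponential distribution -- both choices are assumptions for illustration.
    """
    rng = rng or np.random.default_rng()
    expected_lr = base_lr * (1.0 + redundancy)
    return rng.exponential(expected_lr)

# Hypothetical use inside a training loop:
# for epoch in range(num_epochs):
#     w = model.layer.weight.detach().cpu().numpy()   # neurons as points
#     lr = sample_learning_rate(0.1, estimate_redundancy(w))
#     for group in optimizer.param_groups:
#         group["lr"] = lr
#     train_one_epoch(model, optimizer, loader)
```

Note that, consistent with the abstract, the sketch does not fix the learning rate of each epoch deterministically; only the expectation of the sampled value is controlled, while the realized value per step remains random.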
