A new approach for the vanishing gradient problem on sigmoid activation

Matías Roodschild, Jorge Gotay Sardiñas, Adrián Will

doi:10.1007/s13748-020-00218-y

https://doi.org/10.1007/s13748-020-00218-y

Abstract

The vanishing gradient problem (VGP) is an important issue when training multilayer neural networks with the backpropagation algorithm. The problem is worse when sigmoid transfer functions are used in a network with many hidden layers. However, the sigmoid function is very important in several architectures, such as recurrent neural networks and autoencoders, where the VGP might also appear. In this article, we propose a modification of the backpropagation algorithm for training sigmoid neurons. It consists of adding a small constant to the calculation of the sigmoid’s derivative, so that the proposed training direction differs slightly from the gradient while the original sigmoid function is kept in the network. Our results suggest that the modified derivative reaches the same accuracy in fewer training steps on most datasets. Moreover, due to the VGP, training with the original derivative does not converge for sigmoid networks with more than five hidden layers, whereas the modification allows backpropagation to train two extra hidden layers in feedforward neural networks.
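
The modification described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the constant `eps = 0.05`, the function names, and the seven-layer example path are all hypothetical, since the abstract specifies none of them. The sketch keeps the forward sigmoid unchanged and adds the constant only where the derivative is computed, then compares how the chained derivative factor shrinks with depth in both cases (weights are ignored for simplicity).

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function; the forward pass is left unchanged."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Exact derivative s(x) * (1 - s(x)); its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def sigmoid_grad_modified(x: float, eps: float = 0.05) -> float:
    """Derivative plus a small constant, used only in the backward pass.

    eps = 0.05 is an illustrative assumption, not a value from the paper.
    """
    return sigmoid_grad(x) + eps


def chained_factor(grad_fn, pre_activations):
    """Product of per-layer derivative factors along one path (weights ignored)."""
    factor = 1.0
    for z in pre_activations:
        factor *= grad_fn(z)
    return factor


# Hypothetical example: one pre-activation of 2.0 per hidden layer, seven layers.
zs = [2.0] * 7
plain = chained_factor(sigmoid_grad, zs)
modified = chained_factor(sigmoid_grad_modified, zs)
print(f"exact derivative:    {plain:.3e}")
print(f"modified derivative: {modified:.3e}")
```

Because the exact derivative never exceeds 0.25, its product across layers shrinks geometrically; the added constant raises every factor, so the backpropagated signal decays more slowly. The resulting update direction is no longer the true gradient, which is the trade-off the abstract describes.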

Journal: Progress in Artificial Intelligence
Publication Date: Oct 20, 2020
Citations: 58

