Abstract

Deep multi-layer neural networks represent hypotheses of very high polynomial degree that can solve very complex problems. Such deep networks are trained through backpropagation with gradient descent optimization algorithms, which suffer from persistent issues such as the vanishing gradient problem. To overcome the vanishing gradient problem, we introduce a new anti-vanishing backpropagated learning algorithm called oriented stochastic loss descent (OSLD). OSLD iteratively updates each randomly initialized parameter in the direction opposite to the sign of its partial derivative, by a small positive random number scaled by a tuned ratio of the model loss. This paper compares OSLD with stochastic gradient descent (SGD), the basic backpropagation algorithm, and with Adam, one of the best-performing backpropagation optimizers, on five benchmark models. Experimental results show that OSLD is very competitive with Adam in small and moderately deep models, and that OSLD outperforms Adam in very deep models. Moreover, OSLD is compatible with current backpropagation architectures, with the exception of learning rates. Finally, OSLD is stable and opens more choices for training very deep multi-layer neural networks.
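
For illustration, the sketch below shows one possible reading of the update rule described in the abstract: each parameter moves opposite to the sign of its partial derivative by a small positive random number scaled by a tuned ratio of the model loss. The function name `osld_update`, the `loss_ratio` parameter, and the uniform sampling range are assumptions for this sketch, not details taken from the paper.

```python
import numpy as np

def osld_update(params, grads, loss, loss_ratio=0.01, rng=None):
    """One OSLD-style update step (sketch based on the abstract only).

    Each parameter is moved opposite to the sign of its partial
    derivative by a small positive random number, scaled by a tuned
    ratio of the current model loss. The sampling range (0, 1e-3)
    and the default loss_ratio are assumed values.
    """
    rng = np.random.default_rng() if rng is None else rng
    step_scale = loss_ratio * loss  # scale random steps by a ratio of the loss
    updated = []
    for p, g in zip(params, grads):
        # small positive random number per parameter entry (assumed range)
        r = rng.uniform(0.0, 1e-3, size=p.shape)
        # step against the gradient sign, ignoring the gradient magnitude
        updated.append(p - np.sign(g) * r * step_scale)
    return updated
```

Note that, unlike SGD, this rule uses only the sign of each partial derivative, not its magnitude, which is consistent with the abstract's claim that OSLD does not rely on a conventional learning rate.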
