Pushing Stochastic Gradient towards Second-Order Methods – Backpropagation Learning with Transformations in Nonlinearities

Tommi Vatanen, Harri Valpola, Tapani Raiko, Yann LeCun

doi:10.1007/978-3-642-42054-2_55

Abstract

Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and to use separate shortcut connections to model the linear dependencies instead. We continue the work, first by introducing a third transformation that normalizes the scale of the outputs of each hidden neuron, and second by analyzing the connections to second-order optimization methods. We show that the transformations make a simple stochastic gradient behave more like second-order optimization methods and thus speed up learning. This is shown both in theory and with experiments. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, where both the inputs and outputs of many hidden neurons are close to zero.
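The three transformations can be illustrated with a minimal sketch for a single hidden neuron. It assumes a tanh nonlinearity and the parameterization g(x) = gamma * (tanh(x) + alpha * x + beta); the names alpha, beta, gamma and the helper functions are illustrative, not taken from the abstract. In particular, "normalize the scale" is read here as unit variance of the output, which is only one plausible interpretation and may differ from the criterion used in the paper. The shortcut connections that take over the removed linear part are not shown.

```python
import math
import random


def fit_transform_params(pre_activations, normalize_scale=True):
    """Return (alpha, beta, gamma) for g(x) = gamma * (tanh(x) + alpha * x + beta).

    alpha makes the average slope zero, beta makes the average output zero,
    and gamma (an assumption of this sketch) rescales the output to unit variance.
    """
    n = len(pre_activations)
    # The slope of tanh is 1 - tanh(x)^2; alpha cancels its average.
    alpha = -sum(1.0 - math.tanh(x) ** 2 for x in pre_activations) / n
    # beta cancels the average of tanh(x) + alpha * x.
    beta = -sum(math.tanh(x) + alpha * x for x in pre_activations) / n
    gamma = 1.0
    if normalize_scale:
        # The centred output has zero mean, so its mean square is its variance.
        var = sum((math.tanh(x) + alpha * x + beta) ** 2 for x in pre_activations) / n
        if var > 0.0:
            gamma = 1.0 / math.sqrt(var)
    return alpha, beta, gamma


def transformed(x, alpha, beta, gamma):
    """Transformed nonlinearity evaluated at pre-activation x."""
    return gamma * (math.tanh(x) + alpha * x + beta)


def transformed_slope(x, alpha, gamma):
    """Derivative of the transformed nonlinearity with respect to x."""
    return gamma * (1.0 - math.tanh(x) ** 2 + alpha)


# Synthetic pre-activations standing in for one neuron's inputs over a data set.
random.seed(0)
xs = [random.gauss(0.5, 1.5) for _ in range(1000)]
alpha, beta, gamma = fit_transform_params(xs)

mean_output = sum(transformed(x, alpha, beta, gamma) for x in xs) / len(xs)
mean_slope = sum(transformed_slope(x, alpha, gamma) for x in xs) / len(xs)
mean_square = sum(transformed(x, alpha, beta, gamma) ** 2 for x in xs) / len(xs)

print("mean output:", mean_output)
print("mean slope:", mean_slope)
print("mean square:", mean_square)
```

In this sketch the parameters are fitted once on a fixed sample. In training they would presumably be re-estimated as the weights change, since the distribution of pre-activations shifts during learning.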
