A Smoothing Regularizer for Feedforward and Recurrent Neural Networks

Lizhong Wu,John Moody

doi:10.1162/neco.1996.8.3.461

Abstract

We derive a smoothing regularizer for dynamic network models by requiring robustness in prediction performance to perturbations of the training data. The regularizer can be viewed as a generalization of the first-order Tikhonov stabilizer to dynamic models. For two layer networks with recurrent connections described by the training criterion with the regularizer is where Φ = {U, V, W} is the network parameter set, Z(t) are the targets, I(t) = {X(s), s = 1,2, …, t} represents the current and all historical input information, N is the size of the training data set, [Formula: see text] is the regularizer, and λ is a regularization parameter. The closed-form expression for the regularizer for time-lagged recurrent networks is where ‖ ‖ is the Euclidean matrix norm and γ is a factor that depends upon the maximal value of the first derivatives of the internal unit activations f(). Simplifications of the regularizer are obtained for simultaneous recurrent nets (τ ↦ 0), two-layer feedforward nets, and one layer linear nets. We have successfully tested this regularizer in a number of case studies and found that it performs better than standard quadratic weight decay.

Full Text