Abstract

This paper proposes a flexible probabilistic activation function that enhances the training and operation of artificial neural networks by intentionally injecting noise to gain additional control over the response of each neuron. During the learning phase, the level of injected noise is iteratively optimized by gradient descent, realizing a form of adaptive stochastic resonance. Starting from simple hard-threshold, non-differentiable neuronal responses, controlled noise injection gives access to a wide range of useful activation functions with sufficient differentiability to enable gradient-descent learning of both the neuron parameters and the injected-noise levels. Experimental results on function approximation show that the injected noise generally converges to non-vanishing optimal levels associated with improved generalization in the neural networks. A theoretical explanation of this generalization improvement, based on a path-norm bound, is presented. With noise injected into deep neural networks, experiments on image classification likewise reach non-vanishing optimal noise levels that yield better test accuracies. The proposed probabilistic activation functions show the potential of adaptive stochastic resonance for useful applications in machine learning.
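To make the mechanism described above concrete, the sketch below illustrates one way a hard-threshold neuron can be smoothed by averaging over injected noise, with the noise level exposed as a learnable parameter. It is a minimal illustration only: the choice of Gaussian noise, the Heaviside threshold, and all names and parameters here are assumptions for the example, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class ProbabilisticActivation(nn.Module):
    """Smoothed hard-threshold activation obtained by averaging over injected noise.

    A hard threshold 1[x > 0] driven by additive zero-mean Gaussian noise of
    standard deviation sigma has expected output Phi(x / sigma), the Gaussian CDF.
    Treating sigma as a learnable parameter lets gradient descent tune the
    injected-noise level alongside the network weights, in the spirit of
    adaptive stochastic resonance. (Illustrative assumption, not the paper's
    exact construction.)
    """

    def __init__(self, init_sigma: float = 1.0):
        super().__init__()
        # Learn log(sigma) so the noise level stays strictly positive.
        self.log_sigma = nn.Parameter(torch.log(torch.tensor(float(init_sigma))))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigma = self.log_sigma.exp()
        # E[1[x + n > 0]] = Phi(x / sigma): differentiable in both x and sigma.
        return 0.5 * (1.0 + torch.erf(x / (sigma * 2.0 ** 0.5)))


if __name__ == "__main__":
    act = ProbabilisticActivation(init_sigma=0.5)
    net = nn.Sequential(nn.Linear(4, 8), act, nn.Linear(8, 1))
    x = torch.randn(16, 4)
    loss = net(x).pow(2).mean()
    loss.backward()  # gradients reach the weights and log_sigma (the noise level)
    print(act.log_sigma.grad)
```

In such a setup the optimizer updates the noise level together with the weights, so a non-vanishing optimum for sigma, as reported in the abstract, would appear as log_sigma converging to a finite value rather than diverging to negative infinity.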
