A Formal Characterization of Activation Functions in Deep Neural Networks.

Massi Amrouche, Dušan M Stipanović

doi:10.1109/tnnls.2022.3187538

Open Access

Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication Date: Feb 1, 2024
License type: CC BY 4.0

Affiliation: University of Illinois Urbana-Champaign

#Activation Functions In Neural Networks #Activation Functions

Abstract

In this article, a mathematical formulation for describing and designing activation functions in deep neural networks is provided. The methodology is based on a precise characterization of the desired activation functions that satisfy particular criteria, including circumventing vanishing or exploding gradients during training. The problem of finding desired activation functions is formulated as an infinite-dimensional optimization problem, which is later relaxed to solving a partial differential equation. Furthermore, bounds that guarantee the optimality of the designed activation function are provided. Relevant examples with some state-of-the-art activation functions are provided to illustrate the methodology.

Full Text