Abstract

Artificial neural networks (ANNs) succeed in many real-world tasks thanks to their strong learning abilities. This paper focuses on theoretical aspects of ANNs in order to better guide the modifications that allow ANNs to absorb the defining features of each scenario. The work also belongs to the line of research devoted to providing mathematical explanations of ANN performance, with special attention to activation functions. The base algorithm is mathematically decoded to analyse the features that activation functions must have, regarding both their impact on the training process and the applicability of the Universal Approximation Theorem. In particular, significant new results are presented that identify which activation functions suffer from some of the usual failings (regarding gradient preservation). To the best of the author's knowledge, this is the first paper that stresses the role of injectivity of activation functions, a property that has received scant attention in the literature despite its strong influence on ANN performance. Along these lines, injective activation functions are characterized in terms of monotonic functions that satisfy the classical contractive condition, a particular case of Lipschitz functions. A summary table is also provided, aimed at documenting how to select the best activation function for each situation.
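As an informal illustration of the contractive condition and the injectivity property mentioned above (not the paper's formal characterization), the following NumPy sketch numerically estimates a Lipschitz constant for a few common activation functions and checks strict monotonicity, which is sufficient for injectivity. The estimate is only a lower bound on the true constant, and the choice of functions and grid is arbitrary.

```python
import numpy as np

def lipschitz_estimate(phi, grid):
    """Max slope between consecutive grid points; only a lower bound on the
    true Lipschitz constant of phi (e.g. tanh's true constant is exactly 1)."""
    vals = phi(grid)
    return float(np.max(np.abs(np.diff(vals)) / np.diff(grid)))

def strictly_monotonic(phi, grid):
    """Strict monotonicity on the grid; a strictly monotonic function is injective."""
    d = np.diff(phi(grid))
    return bool(np.all(d > 0) or np.all(d < 0))

activations = {
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),           # contractive: constant 1/4
    "tanh": np.tanh,                                           # constant 1, not a strict contraction
    "relu": lambda z: np.maximum(0.0, z),                      # constant 1, not injective (flat for z <= 0)
    "leaky_relu": lambda z: np.where(z > 0.0, z, 0.01 * z),    # constant 1, injective
}

grid = np.linspace(-10.0, 10.0, 20001)
for name, phi in activations.items():
    print(f"{name:>10s}: injective (strictly monotonic on grid) = {strictly_monotonic(phi, grid)}, "
          f"Lipschitz estimate ~ {lipschitz_estimate(phi, grid):.3f}")
```

The contractive condition asks for a Lipschitz constant strictly below 1; on this reading, only the sigmoid in the list above is a strict contraction, while tanh and the (leaky) ReLU sit exactly at the boundary.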

Highlights

  • Forecasting is one of the greatest successes of human beings

  • While a great deal of research aimed at ensuring the stability of learning processes proposes bounding the variables, some authors [11] stress the importance of using bounded activation functions in order to avoid instability (see the sketch after this list)

  • As for injectivity regarding the Universal Approximation Theorem (UAT), we refer to work [21], where a particular class of continuous activation functions φ is introduced: those which satisfy either of the following equivalent conditions: φ is injective and has no fixed points ⇔ either φ(z) > z or φ(z) < z holds for every z ∈ Dom(φ)
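The following NumPy sketch (an informal illustration, not taken from [11] or [21]) inspects both properties on a finite grid for a few common activation functions: the output range, as a proxy for boundedness, and whether φ(z) > z at every grid point or φ(z) < z at every grid point, i.e. the no-fixed-point condition. The function list and grid are arbitrary choices.

```python
import numpy as np

def inspect_activation(name, phi, z):
    """Report the output range over the grid (a bounded activation keeps a small
    range even for large |z|) and whether the no-fixed-point condition holds,
    i.e. phi(z) > z at every grid point or phi(z) < z at every grid point."""
    vals = phi(z)
    above = bool(np.all(vals > z))
    below = bool(np.all(vals < z))
    print(f"{name:>8s}: output range [{vals.min():8.3f}, {vals.max():8.3f}], "
          f"no fixed point on grid = {above or below}")

z = np.linspace(-30.0, 30.0, 6001)
inspect_activation("sigmoid",  lambda t: 1.0 / (1.0 + np.exp(-t)), z)  # bounded in (0, 1); crosses the diagonal -> fixed point
inspect_activation("tanh",     np.tanh, z)                             # bounded in (-1, 1); fixed point at z = 0
inspect_activation("relu",     lambda t: np.maximum(0.0, t), z)        # unbounded above; phi(z) = z for z >= 0
inspect_activation("softplus", lambda t: np.logaddexp(0.0, t), z)      # unbounded above, yet phi(z) > z for every z
```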


Summary

Introduction

Forecasting is one of the greatest successes of human beings. It is the engine that provides solid support for decision making (DM), by simulating a range of future possibilities in order to anticipate potential problems and/or by designing tools that increase the reliability of predictions. This paper may firstly be encompassed within the trend devoted to providing mathematical explanations of ANN performance. An example of this trend is the work on the Universal Approximation Theorem (UAT), which shows that any continuous function on a compact set can be approximated by a fully connected neural network with one hidden layer using a nonpolynomial activation function. A further study of the advantages and disadvantages of activation functions is then performed. These are decisive pieces in the success or failure of ANNs, as we shall see when we explore their determinant features regarding the applicability of the Universal Approximation Theorem. Another reason to carry out this analysis is the considerable influence that the choice of activation function has on the training process.
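A minimal numerical sketch of what the UAT asserts, assuming NumPy: a single hidden layer with the nonpolynomial activation tanh, randomly drawn hidden weights, and output weights fitted by ordinary least squares, approximating an arbitrary continuous target on the compact interval [-1, 1]. This is only an illustration of approximation capacity, not the theorem's proof nor the construction used in the paper; the target function, width and weight scale are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function on the compact set [-1, 1] (arbitrary choice).
f = lambda x: np.sin(3.0 * x) + 0.3 * np.cos(9.0 * x)

x = np.linspace(-1.0, 1.0, 400).reshape(-1, 1)
y = f(x)

# One hidden layer with a nonpolynomial activation (tanh).
# Hidden weights/biases are drawn at random; only the output layer is fitted
# here (ordinary least squares), which suffices to illustrate approximation.
width = 200
W = rng.normal(scale=5.0, size=(1, width))
b = rng.normal(scale=5.0, size=(1, width))
H = np.tanh(x @ W + b)                        # hidden activations, shape (400, width)

coef, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights
y_hat = H @ coef

# The maximum error on the grid should be small, in line with the UAT.
print("max |f(x) - network(x)| on the grid:", float(np.max(np.abs(y - y_hat))))
```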

Mathematical Foundations
Theoretical Learning Algorithm
GDM: Gradient Descent Minimum or Cauchy Descent
Training the ANN
The Role of the Activation Function
Derived from the Theoretical Foundations of the Training Process
Influence of Activation Functions on the Training Process
Mainly Used Activation Functions
Practical Learning Algorithm
Conclusions
