General Activation Functions Research Articles

In this paper, we provide explicit upper bounds on some distances between the (law of the) output of a random Gaussian neural network and (the law of) a random Gaussian vector. Our main results concern deep random Gaussian neural networks with a rather general activation function. The upper bounds show how the widths of the layers, the activation function, and other architecture parameters affect the Gaussian approximation of the output. Our techniques, which rely on Stein’s method and integration by parts formulas for the Gaussian law, yield estimates on distances that are integral probability metrics and include the convex distance. This latter metric is defined by testing against indicator functions of measurable convex sets, and so allows for accurate estimates of the probability that the output is localized in some region of the space, an aspect of significant interest from both a practitioner’s and a theorist’s perspective. We illustrate our results with some numerical examples. Funding: This research was supported by the European Union’s Horizon 2020 research project WARIFA under grant agreement no. 101017385, by the PRIN project 2022 “Variational Analysis of Complex Systems in Materials Science, Physics and Biology” (CUP B53D23009290006), and by the INdAM project “Modelli ed Algoritmi per dati ad elevata dimensionalità” (CUP E53C23001670001).
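The dependence of the Gaussian approximation on layer width can be explored numerically. The following is a minimal sketch, not the paper's method: it samples the scalar output of a small fully connected network with i.i.d. Gaussian weights and compares the excess kurtosis of a narrow and a wide network. The function names, the `1/sqrt(n_in)` weight scaling, the unit-variance biases, the `tanh` activation, and the use of excess kurtosis as a crude Gaussianity diagnostic are all assumptions made for illustration; the abstract's bounds concern other distances, such as the convex distance.

```python
import math
import random


def sample_output(x, widths, rng, activation=math.tanh):
    """Draw fresh Gaussian weights and return the scalar network output for input x.

    Assumed setup (not taken from the paper): weights ~ N(0, 1/n_in),
    biases ~ N(0, 1), and a linear Gaussian readout with no bias.
    """
    h = list(x)
    for n_out in widths:
        scale = 1.0 / math.sqrt(len(h))  # weight standard deviation
        h = [
            activation(sum(rng.gauss(0.0, scale) * v for v in h) + rng.gauss(0.0, 1.0))
            for _ in range(n_out)
        ]
    scale = 1.0 / math.sqrt(len(h))
    return sum(rng.gauss(0.0, scale) * v for v in h)


def excess_kurtosis(samples):
    """Sample excess kurtosis; it is 0 for an exactly Gaussian distribution."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [1.0, -0.5]         # arbitrary fixed input

# Two hidden layers each: width 1 (narrow) versus width 32 (wide).
narrow = [sample_output(x, [1, 1], rng) for _ in range(1500)]
wide = [sample_output(x, [32, 32], rng) for _ in range(1500)]

kurt_narrow = excess_kurtosis(narrow)
kurt_wide = excess_kurtosis(wide)
print(f"excess kurtosis, width 1:  {kurt_narrow:.3f}")
print(f"excess kurtosis, width 32: {kurt_wide:.3f}")
```

Under these assumptions, the wide network's output should show excess kurtosis much closer to zero than the narrow one, which is consistent with the qualitative claim that wider layers give a better Gaussian approximation.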


Prescriptive analytics, which uses data-driven insights to guide future actions, is now in widespread use. The distribution of data, however, differs depending on the scenario, making it difficult to interpret and comprehend the data efficiently. Different neural network models, which take inspiration from the complex network architecture of the human brain, are used to address this. The activation function is crucial in introducing non-linearity to process data gradients effectively. Although popular activation functions such as ReLU, Sigmoid, Swish, and Tanh each have advantages and disadvantages, they may struggle to adapt to diverse data characteristics. A generalized activation function named the Generalized Exponential Parametric Activation Function (GEPAF) is proposed to address this issue. This function has three parameters: α, a differencing factor similar to the mean; σ, a variance that controls the spread of the distribution; and p, a power factor that improves flexibility. All three parameters appear in the exponent. When p=2, the activation function resembles a Gaussian function. The paper first presents the derivation of the function and validates its properties mathematically and graphically. The GEPAF function is then applied to two real-world supply chain datasets. One dataset features a small sample size but exhibits high variance, while the other shows significant variance with a moderate amount of data. An LSTM network processes each dataset for sales and profit prediction. In a comparative analysis, the proposed function performs better than popular activation functions, showing at least 30% improvement in regression evaluation metrics and better loss decay characteristics.
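The abstract does not state the GEPAF formula, so the exact definition is not reproduced here. The sketch below uses a hypothetical form, `exp(-|x - α|^p / (2 σ^p))`, chosen only because it matches the description: all three parameters sit in the exponent, and p = 2 recovers an unnormalized Gaussian centered at α. The function name `gepaf_sketch`, the factor of 2, and the default values are assumptions and may differ from the published function.

```python
import math


def gepaf_sketch(x, alpha=0.0, sigma=1.0, p=2.0):
    """Hypothetical exponential-parametric activation (NOT the published GEPAF formula).

    alpha shifts the peak, sigma controls the spread, p controls the shape.
    With p = 2 this is the unnormalized Gaussian exp(-(x - alpha)^2 / (2 sigma^2)).
    """
    if sigma <= 0 or p <= 0:
        raise ValueError("sigma and p must be positive")
    return math.exp(-(abs(x - alpha) ** p) / (2.0 * sigma ** p))


# Evaluate the assumed form on a small grid for two values of p.
for p in (1.0, 2.0):
    values = [round(gepaf_sketch(x, alpha=0.0, sigma=1.0, p=p), 4) for x in (-2, -1, 0, 1, 2)]
    print(f"p={p}: {values}")
```

Under this assumed form the function peaks at x = α and decays on both sides, so it is non-monotonic, as the article title says of GEPAF.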


Articles published on General Activation Functions

Multistability of recurrent neural networks with general periodic activation functions and unbounded time-varying delays

Normal Approximation of Random Gaussian Neural Networks

GEPAF: A non-monotonic generalized activation function in neural network for improving prediction with diverse data distributions characteristics

μ-stability and instability of multiple equilibrium points in delayed neural networks with general discontinuous activation functions

Multistability of Almost Periodic Solutions for Fuzzy Competitive NNs With Time-Varying Delays.

Stability Analysis of Quaternion-Valued Neutral Neural Networks with Generalized Activation Functions

Training Provably Robust Models by Polyhedral Envelope Regularization.

Latent code-based fusion: A Volterra neural network approach

Finite-Time and Fixed-Time Synchronization of Quaternion-Valued Neural Networks With/Without Mixed Delays: An Improved One-Norm Method.

Synchronization of Uncertain Neural Networks with Additive Time-Varying Delays and General Activation Function

A novel finite-time complex-valued zeoring neural network for solving time-varying complex-valued Sylvester equation

Finite-time stabilization of quaternion-valued neural networks with time delays: An implicit function method

Extended analysis on the global Mittag-Leffler synchronization problem for fractional-order octonion-valued BAM neural networks

An ESETM based robust synchronizing control on master-slave neural network with multiple time-varying delays

Global [formula omitted]-stabilization of quaternion-valued inertial BAM neural networks with time-varying delays via time-delayed impulsive control

Noise-Tolerant Zeroing Neural Dynamics for Solving Hybrid Multilayered Time-Varying Linear Equation System

Information theoretic limits of learning a sparse rule

Discovering Parametric Activation Functions

Novel trial functions and rogue waves of generalized breaking soliton equation via bilinear neural network method
