Abstract
In neural networks, the activation function is a vital component of the learning and inference process. Many approaches exist, but only nonlinear activation functions, often called nonlinearities, allow such networks to solve non-trivial problems with a small number of nodes. With the emergence of deep learning, the need has arisen for capable activation functions that enable or expedite learning in deeper layers. In this paper, we propose a novel activation function that combines features of several successful activation functions, achieving 2.53% higher accuracy than the industry-standard ReLU in a variety of test cases.
Highlights
Activation functions originated from attempts to generalize the linear discriminant function in order to address nonlinear classification problems in pattern recognition.
We take the accuracy achieved by the rectified linear unit (ReLU) as the baseline result, and compute normalized accuracy as the ratio of a new activation function's accuracy to the accuracy achieved by ReLU.
All activation functions perform well, with LeLeLU giving a small boost of 0.23% over the ReLU baseline.
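The normalized-accuracy metric above can be sketched in a few lines. Note that the accuracy values used here are made-up placeholders chosen only to illustrate a 0.23% boost; they are not results from the paper.

```python
def normalized_accuracy(acc_new: float, acc_relu: float) -> float:
    """Ratio of a candidate activation's accuracy to the ReLU baseline.

    A value above 1.0 means the candidate outperforms ReLU.
    """
    return acc_new / acc_relu


# Hypothetical accuracies: ReLU baseline 0.870, candidate activation 0.872.
ratio = normalized_accuracy(0.872, 0.870)
boost_percent = (ratio - 1.0) * 100  # relative boost over the baseline, in %
```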
Summary
The parameter α is learnable per filter during training, and during testing we observed a correlation between dataset complexity, the depth-wise position of the respective filter in the network topology, and the training phase. The strong point of the proposed activation function is that the learnable parameter influences both the negative and the positive values. This implies that the adaptation of α can accelerate training in certain parts of the network during certain epochs of the training procedure, when α takes values larger than 1. The adaptation of the parameter α is investigated in more detail.
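A minimal sketch of an activation of this form, where a learnable α scales both the positive and the negative region, is shown below. The small negative-region slope constant (`leak=0.01`) and the function name are assumptions for illustration; the paper's exact parameterization may differ.

```python
import numpy as np


def lelelu(x, alpha, leak=0.01):
    """Sketch of a learnable leaky activation (assumed form):

        f(x) = alpha * x         for x >= 0
        f(x) = alpha * leak * x  for x <  0

    alpha scales both regions, so alpha > 1 amplifies the gradient
    flowing through this unit and can speed up training locally.
    """
    x = np.asarray(x, dtype=float)
    return alpha * np.where(x >= 0, x, leak * x)


# In a network, alpha would be a trainable parameter per filter,
# updated by backpropagation alongside the weights.
out = lelelu(np.array([-2.0, 0.0, 2.0]), alpha=1.5)
```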