Abstract

Activation functions play an important role in artificial neural networks (ANNs) because they introduce non-linearity into the transformations performed by a model. Thanks to the recent surge of interest in ANNs, new and improved activation functions continue to emerge. This paper presents the results of research on the effectiveness of ANNs with the ReLU, Leaky ReLU, ELU, and Swish activation functions. Four different data sets and three different network architectures were used. The results show that the Leaky ReLU, ELU, and Swish functions, which are intended to alleviate the vanishing-gradient and dead-neuron problems, perform better in deeper and more complex architectures, but at the cost of prediction speed. The Swish function appears to speed up the training process considerably, but none of the three aforementioned functions achieves the best accuracy across all of the data sets used.
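For reference, the four activation functions compared in the paper have the following standard definitions. The sketch below is not taken from the paper itself; it uses the common formulations, with α = 0.01 for Leaky ReLU and α = 1.0 for ELU (typical defaults, assumed here).

```python
import numpy as np

def relu(x):
    """ReLU: max(0, x) — zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: small slope alpha for negative inputs avoids dead neurons."""
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    """ELU: smooth exponential saturation toward -alpha for negative inputs."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x):
    """Swish: x * sigmoid(x), a smooth, non-monotonic function."""
    return x / (1.0 + np.exp(-x))
```

Unlike ReLU, the latter three functions pass a non-zero gradient for negative inputs, which is the mechanism behind the vanishing-gradient and dead-neuron mitigation discussed in the abstract.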
