In deep learning models, the inputs to the network are passed through activation functions to produce the corresponding outputs. Deep learning models are particularly important for analyzing and forecasting from big data with numerous parameters, and they are useful for image processing, natural language processing, object recognition, and financial forecasting. The traditional sigmoid and hyperbolic tangent (tanh) activation functions are widely used in deep learning models, but both suffer from the vanishing gradient problem. To overcome this problem, the ReLU activation function and its variants were proposed in the literature. However, these activation functions have a negative region problem. In this study, novel RSigELU activation functions, namely the single-parameter RSigELU (RSigELUS) and the double-parameter RSigELU (RSigELUD), which combine the ReLU, sigmoid, and ELU activation functions, are proposed. The proposed RSigELUS and RSigELUD activation functions can overcome the vanishing gradient and negative region problems and can be effective in the positive, negative, and linear activation regions. The proposed RSigELU activation functions were evaluated on the MNIST, Fashion MNIST, CIFAR-10, and IMDb Movie benchmark datasets. The experimental evaluations showed that the proposed activation functions outperformed the other activation functions evaluated.
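The two problems named above can be illustrated with the standard activation functions alone. The following is a minimal sketch, not code from the study: it uses the textbook definitions of sigmoid, tanh, ReLU, and ELU, and the function names and the ELU default `alpha=1.0` are assumptions made for illustration. The RSigELUS and RSigELUD definitions are not stated in this abstract, so they are not reproduced here. The sketch shows that the sigmoid and tanh derivatives shrink toward zero for large inputs (the vanishing gradient problem), and that the ReLU derivative is exactly zero for negative inputs, which is presumably what the negative region problem refers to, whereas ELU keeps a nonzero gradient there.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def d_sigmoid(x):
    """Derivative of the sigmoid; its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def d_tanh(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    """ReLU: identity for positive inputs, zero otherwise."""
    return max(0.0, x)


def d_relu(x):
    """Derivative of ReLU (taken as 0 at x = 0 by convention)."""
    return 1.0 if x > 0 else 0.0


def elu(x, alpha=1.0):
    """ELU: identity for positive inputs, alpha * (e^x - 1) otherwise.

    alpha=1.0 is an illustrative default, not a value from the study.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def d_elu(x, alpha=1.0):
    """Derivative of ELU: 1 for positive inputs, alpha * e^x otherwise."""
    return 1.0 if x > 0 else alpha * math.exp(x)


if __name__ == "__main__":
    # Print the gradient of each function at a few sample inputs.
    for x in (-10.0, -1.0, 0.5, 10.0):
        print(
            f"x={x:6.1f}  "
            f"sigmoid'={d_sigmoid(x):.2e}  "
            f"tanh'={d_tanh(x):.2e}  "
            f"relu'={d_relu(x):.1f}  "
            f"elu'={d_elu(x):.2e}"
        )
```

Running the script prints the gradient of each function at a few sample inputs. The saturating functions yield very small gradients at both extremes, ReLU yields no gradient for the negative inputs, and ELU yields a small but nonzero gradient there.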