DPReLU: Dynamic Parametric Rectified Linear Unit and Its Proper Weight Initialization Method

Donghun Yang,Kien Mai Ngoc,Myunggwon Hwang,Iksoo Shin

doi:10.1007/s44196-023-00186-w

Abstract

Activation functions are essential in deep learning, and the rectified linear unit (ReLU) is the most widely used activation function to solve the vanishing gradient problem. However, owing to the dying ReLU problem and bias shift effect, deep learning models using ReLU cannot exploit the potential benefits of negative values. Numerous ReLU variants have been proposed to address this issue. In this study, we propose Dynamic Parametric ReLU (DPReLU), which can dynamically control the overall functional shape of ReLU with four learnable parameters. The parameters of DPReLU are determined by training rather than by humans, thereby making the formulation more suitable and flexible for each model and dataset. Furthermore, we propose an appropriate and robust weight initialization method for DPReLU. To evaluate DPReLU and its weight initialization method, we performed two experiments on various image datasets: one using an autoencoder for image generation and the other using the ResNet50 for image classification. The results show that DPReLU and our weight initialization method provide faster convergence and better accuracy than the original ReLU and the previous ReLU variants.

Full Text