Abstract
Deep Neural Networks have become the tool of choice for Machine Learning practitioners today. They have been successfully applied to a large class of learning problems in both industry and academia, with applications in fields such as Computer Vision, Natural Language Processing, Big Data Analytics and Bioinformatics. One important aspect of designing a neural network is the choice of the activation function used at the neurons of the different layers. Activation functions introduce non-linearity into the neural network model so that the network can progressively learn more effective feature representations. Several different activation functions have been used in the literature. However, Linear, Sigmoid, Tanh and ReLU are the most commonly used, and they are often selected empirically during the network design phase rather than through a proper data-driven process. In this work we empirically study the problem of generalizing the single-output ReLU activation by parameterizing it, so that data-driven methods can be used to select among variations of the single-output ReLU. We call this class of activations the Generalized ReLU Activations. Special cases include ReLU itself as well as variations such as the Leaky ReLU that have already been studied in the literature. We report results of extensive experiments on the well-known MNIST handwritten digit dataset.
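The abstract does not spell out the exact parameterization, but a minimal sketch of the idea is a piecewise-linear activation with separate slopes on the negative and positive sides, so that standard ReLU and Leaky ReLU fall out as special settings of the parameters. The parameter names alpha and beta below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def generalized_relu(x, alpha=0.0, beta=1.0):
    """Illustrative parameterized ReLU: beta * x for x >= 0, alpha * x for x < 0.

    alpha=0.0, beta=1.0 recovers the standard ReLU; a small positive alpha
    (e.g. 0.01) with beta=1.0 recovers the Leaky ReLU. The parameters could
    then be tuned or learned in a data-driven way rather than fixed by hand.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, beta * x, alpha * x)

# The same input passed through three members of the family:
x = np.array([-2.0, -0.5, 0.0, 1.5])
print(generalized_relu(x))                      # standard ReLU
print(generalized_relu(x, alpha=0.01))          # Leaky ReLU
print(generalized_relu(x, alpha=0.1, beta=0.8)) # another member of the family
```

Under this reading, selecting an activation "in a data-driven way" amounts to treating alpha and beta as hyperparameters (or learnable parameters) rather than committing to ReLU or Leaky ReLU up front.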