Improving deep neural networks with multi-layer maxout networks and a novel initialization method

Weichen Sun, Fei Su, Leiquan Wang

doi:10.1016/j.neucom.2017.05.103

Abstract

To enhance the discriminability of convolutional neural networks (CNNs) and facilitate their optimization, we investigate the activation function of a neural network and the corresponding initialization method. First, we propose a trainable activation function with a multi-layer structure, named the “Multi-layer Maxout Network” (MMN). MMN is a multi-layer structured maxout that inherits the advantages of both a non-saturated activation function and a trainable activation function approximator. Second, we derive a robust initialization method specifically for the MMN activation, with a theoretical proof; the method works for the maxout activation as well. Our initialization method could reduce internal covariate shift as signals propagate through the layers and solve the so-called “exploding/vanishing gradient” problem, which leads to more efficient training of deep neural networks. Experimental results show that the proposed model outperforms several state-of-the-art methods on three image classification benchmark datasets (CIFAR-10, CIFAR-100 and ImageNet), and that the initialization method improves performance further. Furthermore, we analyze the influence of MMN in different hidden layers and give a trade-off scheme between accuracy and computing resources.
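For readers unfamiliar with maxout, the sketch below shows the standard maxout unit (the maximum over several affine pieces) in plain Python, and a naive way of feeding one maxout layer into another. The function names (`maxout`, `maxout_layer`, `stacked_maxout`) and the stacking scheme are illustrative assumptions; the abstract does not specify the MMN architecture or the initialization formula, so neither is reproduced here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def maxout(x: Vector, weights: Sequence[Vector], biases: Sequence[float]) -> float:
    """Standard maxout unit: the maximum over k affine pieces w_j . x + b_j."""
    pieces = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(pieces)


def maxout_layer(
    x: Vector,
    layer_weights: Sequence[Sequence[Vector]],
    layer_biases: Sequence[Sequence[float]],
) -> List[float]:
    """A layer of maxout units; each unit owns its own set of affine pieces."""
    return [maxout(x, w, b) for w, b in zip(layer_weights, layer_biases)]


def stacked_maxout(
    x: Vector,
    layers: Sequence[Tuple[Sequence[Sequence[Vector]], Sequence[Sequence[float]]]],
) -> List[float]:
    """Hypothetical stacking (an assumption, not the paper's MMN):
    the output of each maxout layer is the input of the next."""
    h = list(x)
    for layer_weights, layer_biases in layers:
        h = maxout_layer(h, layer_weights, layer_biases)
    return h


# With pieces x and -x a maxout unit computes |x|; with pieces x and 0 it computes ReLU.
abs_unit = ([[1.0], [-1.0]], [0.0, 0.0])
relu_unit = ([[1.0], [0.0]], [0.0, 0.0])

# Two-layer toy example: the first layer holds the two units above,
# the second layer holds one unit with pieces (h1 + h2) and the constant 1.
first = ([abs_unit[0], relu_unit[0]], [abs_unit[1], relu_unit[1]])
second = ([[[1.0, 1.0], [0.0, 0.0]]], [[0.0, 1.0]])
result = stacked_maxout([-2.0], [first, second])
```

Because every piece is affine, a maxout unit never saturates, and with enough pieces it can approximate any convex activation, which is the sense in which maxout is called a trainable activation function approximator.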


Journal: Neurocomputing
Publication Date: Sep 1, 2017
Citations: 86

Similar Papers

A Two-stage Training Mechanism for the CNN with Trainable Activation Function
Kun-Chih Jimmy Chen ... Jing-Wen Liang
21 Oct 2020

Research on improved convolutional wavelet neural network
Jingwei Liu ... Jiaxin Li
Scientific Reports | VOL. 11
09 Sep 2021

Web-aided data set expansion in deep learning: evaluating trainable activation functions in ResNet for improved image classification
Zhiqiang Zhang ... Zhiyong Shi
International Journal of Web Information Systems | VOL. 20
12 Jul 2024

RETRACTED: Breast cancer diagnosis using multiple activation deep neural network
K Vijayakumar ... Sudhir Kumar Sharma
Concurrent Engineering | VOL. 29
25 Jun 2021
