Abstract

Deep convolutional neural networks (CNNs) are renowned for their consistent performance, and practitioners widely understand that the stability of learning depends on how the model parameters in each layer are initialized. Kaiming initialization, the de facto standard, is derived from a much simpler CNN model consisting of only convolution and fully connected layers. Unlike current CNN models, the base model underlying Kaiming initialization includes neither max pooling nor global average pooling layers. In this study, we derive a new initialization scheme formulated from modern CNN architectures, and empirically compare its performance against the standard initialization methods in wide use today.
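As background, the Kaiming scheme referenced above draws weights from a zero-mean Gaussian with variance 2/fan_in, derived under the assumption of ReLU networks built only from convolution and fully connected layers. The following is a minimal NumPy sketch of that baseline for a convolution layer; the function name, shapes, and example layer sizes are illustrative and not taken from the paper, and the paper's proposed scheme (which additionally accounts for pooling layers) is not shown here.

```python
import numpy as np

def kaiming_normal(fan_in: int, shape: tuple, rng=None):
    """He et al. (2015) initialization: W ~ N(0, 2 / fan_in).

    Derived for ReLU networks of convolution and fully connected
    layers only; it does not account for max pooling or global
    average pooling, which is the gap this paper addresses.
    """
    rng = np.random.default_rng() if rng is None else rng
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(loc=0.0, scale=std, size=shape)

# Hypothetical example: a 3x3 convolution with 64 input channels
# and 128 output channels.
# fan_in = in_channels * kernel_height * kernel_width = 64 * 3 * 3
w = kaiming_normal(fan_in=64 * 3 * 3, shape=(128, 64, 3, 3))
```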
