Training CNNs With Normalized Kernels

Mete Ozay, Takayuki Okatani

doi:10.1609/aaai.v32i1.11624

Abstract

Several methods of normalizing convolution kernels have been proposed in the literature to train convolutional neural networks (CNNs), and have shown some success. However, our understanding of these methods has lagged behind their success in application; many questions remain open, such as why a certain type of kernel normalization is effective and what type of normalization should be employed at each (e.g., higher or lower) layer of a CNN. As a first step towards answering these questions, we propose a framework that enables us to use a variety of kernel normalization methods at any layer of a CNN. A naive integration of kernel normalization with a general optimization method, such as SGD, often causes instability while updating parameters. Thus, existing methods employ ad hoc procedures to ensure convergence empirically. In this study, we pose the estimation of convolution kernels under normalization constraints as constraint-free optimization on kernel submanifolds that are identified by the employed constraints. Note that naive application of established optimization methods for matrix manifolds to these problems is not feasible because of the hierarchical nature of CNNs. To address this, we propose an algorithm for optimization on kernel manifolds in CNNs that appropriately scales the space of kernels based on the structure of CNNs and the statistics of data. We theoretically prove that the proposed algorithm converges almost surely to a solution at a single minimum. Our experimental results show that the proposed method can successfully train popular CNN models using several different types of kernel normalization methods. Moreover, they show that the proposed method improves the classification performance of baseline CNNs, and provides state-of-the-art performance on major image classification benchmarks.
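The idea of treating a normalization constraint as constraint-free optimization on a manifold can be illustrated with a minimal sketch. The example below is not the algorithm proposed in the paper (it omits the scaling based on CNN structure and data statistics, which the abstract does not specify); it is a generic Riemannian SGD step on the unit sphere, assuming a single flattened kernel constrained to unit norm. The function names `sphere_sgd_step` and `normalize`, the toy objective, and the learning rate are all hypothetical choices made for illustration.

```python
import math


def norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Scale v to unit norm (the retraction onto the unit sphere)."""
    n = norm(v)
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / n for x in v]


def sphere_sgd_step(kernel, grad, lr):
    """One generic Riemannian SGD step for a unit-norm kernel.

    Assumption: `kernel` already has unit norm and `grad` is the
    ordinary (Euclidean) gradient of the loss at `kernel`.
    """
    # Project the Euclidean gradient onto the tangent space at `kernel`:
    # g_tan = g - <g, w> w
    inner = sum(g * w for g, w in zip(grad, kernel))
    tangent = [g - inner * w for g, w in zip(grad, kernel)]
    # Move along the negative tangent direction, then retract to the sphere.
    moved = [w - lr * t for w, t in zip(kernel, tangent)]
    return normalize(moved)


# Toy objective (hypothetical): f(w) = -<w, target>, minimized on the
# unit sphere at w = target when target has unit norm.
target = normalize([1.0, 2.0, 2.0])
w = normalize([1.0, 0.0, 0.0])
for _ in range(200):
    grad = [-t for t in target]  # Euclidean gradient of f at w
    w = sphere_sgd_step(w, grad, lr=0.1)
```

Because every update ends with a retraction, the kernel satisfies the norm constraint after each step, so no separate projection or penalty term is needed in the objective.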

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 29, 2018
Citations: 4

Similar Papers

An attribution graph-based interpretable method for CNNs
Xiangwei Zheng ... Zhen Cui
Neural Networks | VOL. 179
05 Aug 2024

An efficient method for determining the optimal convolutional neural network structure based on Taguchi method
Pin-Chan Lee ... I-Jyh Wen
Journal of Intelligent & Fuzzy Systems | VOL. 39
07 Oct 2020

Efficient extraction of deep image features using convolutional neural network (CNN) for applications in detecting and analysing complex food matrices
Yao Liu ... Da-Wen Sun
Trends in Food Science & Technology | VOL. 113
06 May 2021

Chapter 3 - Convolutional neural networks
Jenni Raitoharju
Deep Learning for Robot Perception and Cognition
01 Jan 2021
