Towards an Effective Orthogonal Dictionary Convolution Strategy

Yishi Li,Lin Gu,Rui Lai,Kunran Xu

doi:10.1609/aaai.v36i2.20037

Abstract

Orthogonality regularization has proven effective in improving the precision, convergence speed and the training stability of CNNs. Here, we propose a novel Orthogonal Dictionary Convolution Strategy (ODCS) on CNNs to improve orthogonality effect by optimizing the network architecture and changing the regularized object. Specifically, we remove the nonlinear layer in typical convolution block “Conv(BN) + Nonlinear + Pointwise Conv(BN)”, and only impose orthogonal regularization on the front Conv. The structure, “Conv(BN) + Pointwise Conv(BN)”, is then equivalent to a pair of dictionary and encoding, defined in sparse dictionary learning. Thanks to the exact and efficient representation of signal with dictionaries in low-dimensional projections, our strategy could reduce the superfluous information in dictionary Conv kernels. Meanwhile, the proposed strategy relieves the too strict orthogonality regularization in training, which makes hyper-parameters tuning of model to be more flexible. In addition, our ODCS can modify the state-of-the-art models easily without any extra consumption in inference phase. We evaluate it on a variety of CNNs in small-scale (CIFAR), large-scale (ImageNet) and fine-grained (CUB-200-2011) image classification tasks, respectively. The experimental results show that our method achieve a stable and superior improvement.

Full Text