Abstract

Data augmentation is a powerful and widely used technique for improving the generalization of convolutional neural networks. Most data augmentation methods emphasize transforming samples to bolster network robustness. However, these methods overlook learning the invariance among samples augmented by different transformations, which impairs the quality of the learned representations and the generalization of the model. In this paper, we introduce a straightforward yet highly effective method called Consistent Augmentation (CA) that compels networks to explicitly learn invariance to transformations. Specifically, CA minimizes the bidirectional KL divergence between the predicted distributions of different variants generated from the same samples by different transformations, explicitly enforcing consistency across various semantics-preserving transformations. Implementing CA is intuitive and uncomplicated, and it introduces no additional computational cost during inference. Validation experiments on conventional and fine-grained classification tasks demonstrate that CA can be applied across various network architectures and consistently improves their generalization ability. Notably, convolutional neural networks benefit significantly on fine-grained classification tasks, with accuracy improvements of 5.18% and 6.61% over other augmentation methods for the ShuffleNetV2 model on the Stanford Dogs and CUB-200-2011 datasets, respectively. Compatibility experiments on the CIFAR-100 dataset show that CA improves the accuracy of the Mixup and CutMix methods by 1.25% and 2.11%, respectively. These results underscore the effectiveness of our approach when employed alongside mix-based augmentation methods, further enhancing the capabilities of the model.
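
To make the consistency objective described above concrete, the following is a minimal PyTorch sketch of a bidirectional KL consistency loss between two augmented views of the same batch. The function and argument names (consistent_augmentation_loss, augment_1, augment_2, alpha) and the way the consistency term is weighted against cross-entropy are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def consistent_augmentation_loss(logits_a, logits_b):
    """Bidirectional KL divergence between predictions on two augmented views.

    logits_a, logits_b: raw network outputs (shape [batch, num_classes]) for two
    differently transformed versions of the same images.
    """
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    p_a, p_b = log_p_a.exp(), log_p_b.exp()
    # KL(p_a || p_b) + KL(p_b || p_a), averaged over the batch
    kl_ab = F.kl_div(log_p_b, p_a, reduction="batchmean")
    kl_ba = F.kl_div(log_p_a, p_b, reduction="batchmean")
    return kl_ab + kl_ba

def training_loss(model, x, y, augment_1, augment_2, alpha=1.0):
    """Hypothetical training objective: cross-entropy on both views plus an
    alpha-weighted consistency term (alpha is an assumed hyperparameter)."""
    logits_a = model(augment_1(x))
    logits_b = model(augment_2(x))
    ce = F.cross_entropy(logits_a, y) + F.cross_entropy(logits_b, y)
    return ce + alpha * consistent_augmentation_loss(logits_a, logits_b)
```

Since the consistency term only adds a second forward pass during training, inference with the trained model is unchanged, which is consistent with the claim of no additional inference-time cost.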
