Abstract

Deep neural networks have demonstrated their effectiveness in computer vision, especially for image classification and detection. Mixup was recently proposed as a data augmentation scheme that applies a linear combination to two random training examples and their corresponding targets. However, the linearity assumption is inappropriate for training a non-linear model. In this paper, we propose a self-supervised method that requires consistency between original and mixed images in feature space. Our work is motivated by the observation that semantic information is related to the relative position of features. To implement this idea effectively and efficiently, we perform a two-stage training procedure: a running estimation of the class centers of the original data in feature space, and training deep neural networks with a modified Mixup loss term. The proposed approach is also compatible with other Mixup variants. We validate our approach on two popular image classification datasets, CIFAR10 and CIFAR100, with a variety of advanced deep neural networks, and demonstrate consistent, sometimes significant, generalization improvements. We also conduct analytical experiments to evaluate the robustness of our method to hyperparameters.
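The Mixup scheme the abstract refers to can be sketched as follows. This is a minimal illustration of the standard Mixup augmentation (linear combination of two examples and their targets with a Beta-distributed coefficient), not the paper's proposed consistency method; the function name, the `alpha=1.0` default, and the use of NumPy are illustrative assumptions.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Standard Mixup: linearly combine two training examples and targets.

    x1, x2: input arrays (e.g. images); y1, y2: one-hot target vectors.
    alpha parameterizes the Beta distribution from which the mixing
    coefficient lambda is drawn (alpha=1.0 is a common choice, assumed here).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mixed = lam * x1 + (1.0 - lam) * x2  # convex combination of inputs
    y_mixed = lam * y1 + (1.0 - lam) * y2  # same combination of targets
    return x_mixed, y_mixed, lam
```

Because the same coefficient mixes both inputs and targets, the mixed target remains a valid probability vector whenever the originals are; the paper's critique is that a non-linear network need not map this input-space line onto a matching line in feature space.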
