Abstract

Deep learning models have attracted tremendous attention in computer vision in recent years, yet most of them rely heavily on massive amounts of training data. As one solution to the sparse data problem, data augmentation techniques such as image translation and rotation can substantially improve a model’s generalization ability and performance. However, these approaches work primarily in the pixel domain, which limits their ability to fully mine and fuse image data from the frequency viewpoint. Moreover, the fusion weighting factors are mostly set manually, which increases application costs in practice. To this end, we propose a novel method termed frequency-based Mixup (FreMix), which fuses images in the frequency domain and improves the efficiency of data augmentation by adaptively adjusting the weighting coefficients. In FreMix, a fast Fourier transform (FFT) is first performed on the input image, so that frequency information rather than raw pixel information can be extracted for further augmentation. In addition, an exploration-exploitation training paradigm is adopted, so that FreMix can be trained periodically to facilitate learning and avoid manual hyperparameter setting. We conduct comparative experiments on three benchmark datasets, CIFAR, ImageNet, and ILSVRC2015, and the experimental results validate the effectiveness of the proposed method.
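
To make the idea of fusing images in the frequency domain concrete, the following is a minimal sketch and not the FreMix implementation. The abstract does not specify the fusion rule, so the function name `freq_mix`, the fixed coefficient `lam`, and the choice to blend amplitude spectra while keeping the phase of the first image are all assumptions made for illustration. FreMix is described as adjusting the weighting coefficients adaptively, which this sketch does not attempt.

```python
import numpy as np

def freq_mix(img_a, img_b, lam):
    """Blend two same-shaped images in the Fourier domain (illustrative only).

    Hypothetical helper: the fusion rule below is an assumption, not the
    rule used by FreMix.
    """
    # 2-D FFT over the spatial axes (H, W); channels, if present, come last.
    fa = np.fft.fft2(img_a, axes=(0, 1))
    fb = np.fft.fft2(img_b, axes=(0, 1))
    # Assumption: blend the amplitude spectra and keep the phase of img_a.
    amp = lam * np.abs(fa) + (1.0 - lam) * np.abs(fb)
    phase = np.angle(fa)
    mixed = amp * np.exp(1j * phase)
    # Back to pixel space; for real inputs the imaginary part is only
    # floating-point noise, so it is discarded.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))

# Usage with random stand-in images of shape (H, W, C).
rng = np.random.default_rng(0)
img_a = rng.random((8, 8, 3))
img_b = rng.random((8, 8, 3))
mixed = freq_mix(img_a, img_b, lam=0.7)
```

Amplitude and phase are treated separately here because the Fourier transform is linear: a plain weighted sum of the two complex spectra would give exactly the same result as a weighted sum of the pixels.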
