Resource-constrained vision tasks, such as image classification on low-end devices, pose significant challenges due to limited computational resources and scarce training data. Previous studies have addressed this setting with data augmentation methods that optimize image transformations so that effective lightweight models can be learned from few samples. However, these methods either require a calibration step to adapt the augmentation to each specific scenario or hardly exploit the frequency components readily available from Fourier analysis. To address these limitations, we propose a frequency-based image encoding method, FourierAugment, which allows lightweight models to learn richer features from a limited amount of data. Furthermore, in the process of designing FourierAugment, we reveal the correlation between the amount of training data and the frequency components that lightweight models learn. Extensive experiments on multiple resource-constrained vision tasks under diverse conditions corroborate the effectiveness of the proposed FourierAugment method over baselines.
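To illustrate what "frequency components readily available from Fourier analysis" refers to, the following is a minimal sketch of a generic frequency-band image encoding via the 2-D FFT. It is an assumption-based illustration, not the FourierAugment procedure itself; the function name `fourier_band_encode` and the equal-width radial band scheme are hypothetical choices made here for clarity.

```python
import numpy as np

def fourier_band_encode(image: np.ndarray, num_bands: int = 3) -> np.ndarray:
    """Decompose a grayscale image (H, W) into `num_bands` frequency-band
    images and stack them as channels (H, W, num_bands).

    Low-frequency bands capture coarse structure; higher bands capture detail.
    This is a generic illustration, not the paper's exact encoding.
    """
    h, w = image.shape
    # Centered 2-D FFT of the image.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Radial distance of each frequency bin from the spectrum center,
    # normalized to [0, 1].
    yy, xx = np.ogrid[:h, :w]
    radius = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    radius /= radius.max()

    bands = []
    edges = np.linspace(0.0, 1.0, num_bands + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Keep only the annulus of frequencies in [lo, hi); the last band
        # also includes the outermost radius.
        mask = (radius >= lo) & (radius < hi) if hi < 1.0 else (radius >= lo)
        band = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
        bands.append(band)

    return np.stack(bands, axis=-1)

# Example: encode a random 64x64 "image" into 3 frequency-band channels.
encoded = fourier_band_encode(np.random.rand(64, 64), num_bands=3)
print(encoded.shape)  # (64, 64, 3)
```

Stacking such bands as input channels is one plausible way a lightweight model could be exposed to explicit low- and high-frequency information; the paper's actual design choices are given in the full text.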