Abstract

Few-shot learning is a challenging task that aims to learn and identify novel classes from a limited number of unseen labeled samples. Previous work has focused primarily on extracting features in the spatial domain of images alone. However, the compressed representation in the frequency domain, which contains rich pattern information, is a powerful tool in signal processing. Combining the frequency and spatial domains to obtain richer information can effectively alleviate the overfitting problem. In this paper, we propose a dual-domain model called Frequency Space Net (FSNet), which preprocesses input images in both the spatial and frequency domains simultaneously, extracts spatial and frequency information through two feature extractors, and fuses them into a composite feature for image classification. Taking a different view based on frequency analysis, we link conventional average pooling to the Discrete Cosine Transform (DCT) and generalize the compression of the attention mechanism in the frequency domain. Consequently, we propose a novel Frequency Channel Spatial (FCS) attention mechanism. Extensive experiments demonstrate that frequency and spatial information are complementary in few-shot image classification and that combining them improves model performance. Our method outperforms state-of-the-art approaches on miniImageNet and CUB.
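
The link between average pooling and the DCT mentioned in the abstract can be illustrated with a small numerical check. The sketch below is not the FCS attention mechanism itself; it only assumes the unnormalized 2D DCT-II convention and uses a hypothetical helper, `dct2_coefficient`, to show that the lowest-frequency coefficient of a feature map equals its global average scaled by the number of elements. Higher-frequency coefficients carry information that average pooling discards.

```python
import numpy as np


def dct2_coefficient(x, u, v):
    """Unnormalized 2D DCT-II coefficient (u, v) of a 2D array x.

    Hypothetical helper for illustration; the normalization convention
    is an assumption, not taken from the paper.
    """
    h, w = x.shape
    i = np.arange(h).reshape(-1, 1)  # row indices, shape (h, 1)
    j = np.arange(w).reshape(1, -1)  # column indices, shape (1, w)
    # Separable cosine basis function for frequency pair (u, v).
    basis = np.cos(np.pi * u * (i + 0.5) / h) * np.cos(np.pi * v * (j + 0.5) / w)
    return float(np.sum(x * basis))


# A random array standing in for one channel of a feature map.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((7, 7))

# For u = v = 0 the basis is all ones, so the coefficient is the plain sum.
dc = dct2_coefficient(feature_map, 0, 0)
gap = float(feature_map.mean())  # global average pooling of this channel

# The two differ only by the constant factor H * W.
scaled_gap = gap * feature_map.size
```

Under this convention, global average pooling is a special case of frequency-domain compression that keeps only the (0, 0) component; choosing other (u, v) pairs yields additional descriptors of the same channel.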
