Abstract
Few-shot image classification aims to learn a classifier that generalizes well to unseen classes from only a few labeled samples. A major challenge in few-shot learning is how to build effective image representations for support and query images. Recently, local region-based image representations combined with metric learning have proven effective for few-shot classification. However, existing approaches generally represent image regions individually and therefore do not consider the rich spatial and structural relationships among them. In this paper, we propose to bridge the individual regions and exploit the structural context among them via a novel Region-Graph Transformer (RGTransformer). In RGTransformer, each region aggregates information from its neighboring regions and thereby obtains a context-aware feature representation. Building on RGTransformer, we develop an effective metric learning model for few-shot image classification. We evaluate the proposed method on four benchmark datasets, and the experimental results demonstrate the effectiveness and advantages of the proposed RGTransformer.
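The neighbor aggregation described above can be illustrated with a minimal sketch. It is not the RGTransformer implementation, and everything in it is an assumption made for illustration: regions are taken to lie on a regular grid with 4-connected neighbors, aggregation is a single round of parameter-free dot-product attention restricted to each region's neighborhood, and the image-to-image score is a simple best-match cosine similarity. The function names `grid_neighbors`, `region_graph_attention`, and `image_similarity` are hypothetical.

```python
import math


def grid_neighbors(rows, cols):
    """For each region on a rows x cols grid, list itself plus its 4-connected neighbors.

    Assumption: the abstract does not say how the region graph is built;
    a spatial grid graph is used here purely for illustration.
    """
    nbrs = []
    for r in range(rows):
        for c in range(cols):
            cur = [r * cols + c]  # self-loop keeps the region's own feature
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cur.append(rr * cols + cc)
            nbrs.append(cur)
    return nbrs


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def region_graph_attention(features, neighbors):
    """One round of attention restricted to graph neighbors.

    Each output feature is a softmax-weighted average of the region's own
    feature and its neighbors' features (no learned projections; assumption).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    out = []
    for i, nbr_ids in enumerate(neighbors):
        scores = [dot(features[i], features[j]) / scale for j in nbr_ids]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * features[j][d] for w, j in zip(weights, nbr_ids))
            for d in range(dim)
        ])
    return out


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + 1e-12)


def image_similarity(query_regions, support_regions):
    """Hypothetical metric: mean over query regions of the best cosine match
    among the support regions."""
    best = [max(cosine(q, s) for s in support_regions) for q in query_regions]
    return sum(best) / len(best)
```

In this sketch, a query image would be assigned to the support class whose context-aware region features yield the highest `image_similarity` score. A learned model would additionally apply trainable query, key, and value projections before the attention step.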