Abstract

Low-shot sketch-based image retrieval is an emerging task in computer vision that aims to retrieve natural images relevant to hand-drawn sketch queries from classes that are rarely or never seen during training. Related prior works either require aligned sketch-image pairs, which are costly to obtain, or rely on a memory-inefficient fusion layer for mapping the visual information to a semantic space. In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR), and introduce the few-shot setting for SBIR. To solve these tasks, we propose a semantically aligned paired cycle-consistent generative adversarial network (SEM-PCYC) for any-shot SBIR, where each branch of the generative adversarial network maps the visual information from the sketch or image domain to a common semantic space via adversarial training. Each branch maintains cycle consistency, which requires supervision only at the category level and avoids the need for aligned sketch-image pairs. A classification criterion on the generators' outputs ensures that the visual-to-semantic mapping is class-specific. Furthermore, we propose to combine textual and hierarchical side information via an auto-encoder that selects discriminating side information within the same end-to-end model. Our results demonstrate a significant boost in any-shot SBIR performance over the state of the art on the extended versions of the challenging Sketchy, TU-Berlin and QuickDraw datasets.
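The loss composition described above (adversarial alignment, cycle consistency, and a classification criterion on the generator outputs) can be illustrated with a minimal PyTorch sketch. The layer sizes, module names (Generator, g_sk2sem, g_sem2sk, disc_sem) and the single hidden layer are our own illustrative assumptions, not the authors' exact architecture; the image branch and the discriminator's own real/fake update are analogous and omitted for brevity.

```python
# Minimal sketch of the SEM-PCYC-style objective for the sketch branch.
# Dimensions and architecture choices below are illustrative assumptions.
import torch
import torch.nn as nn

feat_dim, sem_dim, num_classes = 512, 300, 100

class Generator(nn.Module):
    """Maps features from one space to another (visual <-> semantic)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU(),
                                 nn.Linear(1024, out_dim))
    def forward(self, x):
        return self.net(x)

# One generator pair per branch: visual -> semantic and semantic -> visual,
# so each branch can enforce cycle consistency without aligned pairs.
g_sk2sem = Generator(feat_dim, sem_dim)          # sketch branch, forward
g_sem2sk = Generator(sem_dim, feat_dim)          # sketch branch, backward
disc_sem = nn.Sequential(nn.Linear(sem_dim, 1))  # adversarial critic (simplified)
classifier = nn.Linear(sem_dim, num_classes)     # class-specificity criterion

bce, l1, ce = nn.BCEWithLogitsLoss(), nn.L1Loss(), nn.CrossEntropyLoss()

def sketch_branch_loss(sketch_feat, labels):
    """Adversarial + cycle-consistency + classification loss for one branch."""
    fake_sem = g_sk2sem(sketch_feat)
    # Adversarial: generated embeddings should fool the semantic-space critic.
    adv = bce(disc_sem(fake_sem), torch.ones(len(fake_sem), 1))
    # Cycle: sketch -> semantic -> sketch should reconstruct the input feature,
    # which needs only category-level supervision, not sketch-image pairs.
    cyc = l1(g_sem2sk(fake_sem), sketch_feat)
    # Classification on the generator output keeps the mapping class-specific.
    cls = ce(classifier(fake_sem), labels)
    return adv + cyc + cls

loss = sketch_branch_loss(torch.randn(8, feat_dim),
                          torch.randint(0, num_classes, (8,)))
```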

Highlights

  • We propose a semantically aligned paired cycle-consistent generative adversarial network (SEM-PCYC) model for the any-shot sketch-based image retrieval (SBIR) task, where each branch maps either sketch or image features to a common semantic space via adversarial training

  • We show that our proposed model consistently improves the state-of-the-art results of any-shot SBIR on all three datasets

  • We propose a paired cycle-consistent generative model where each branch maps either sketch or image features to a common semantic space via adversarial training, which we found effective for reducing the domain gap between sketches and images (see the retrieval sketch after this list)
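Once both branches are trained, sketches and images live in the same semantic space, so retrieval reduces to a nearest-neighbour search there. Below is a minimal sketch of that step; g_sk2sem and g_im2sem stand for the trained sketch- and image-branch generators, and the function signature is our own assumption rather than the authors' code.

```python
# Retrieval-time nearest-neighbour search in the shared semantic space.
import torch
import torch.nn.functional as F

def retrieve(sketch_feat, image_feats, g_sk2sem, g_im2sem, k=5):
    """Rank gallery images for one sketch query by cosine similarity
    in the common semantic space learned by the two branches."""
    q = F.normalize(g_sk2sem(sketch_feat.unsqueeze(0)), dim=1)  # (1, sem_dim)
    g = F.normalize(g_im2sem(image_feats), dim=1)               # (N, sem_dim)
    scores = (g @ q.t()).squeeze(1)                             # (N,) cosine scores
    return scores.topk(k).indices  # indices of the top-k retrieved images
```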


Introduction

Matching natural images with free-hand sketches, i.e. sketch-based image retrieval (SBIR) (Yu et al 2015, 2016a; Liu et al 2017; Pang et al 2017; Song et al 2017b; Shen et al 2018; Zhang et al 2018; Chen and Fang 2018; Kiran Yelamarthi et al 2018; Dutta and Akata 2019; Dey et al 2019), has received a lot of attention. Since for practical applications there is no guarantee that the training data will include all possible queries, a more realistic setting is low-shot or any-shot SBIR (ASSBIR) (Shen et al 2018; Kiran Yelamarthi et al 2018; Dutta and Akata 2019; Dey et al 2019), which combines zero- and few-shot learning (Lampert et al 2014; Vinyals et al 2016; Xian et al 2018a; Ravi and Larochelle 2017) with SBIR into a single task, where the aim is both accurate class prediction and competent retrieval performance. Fine-grained SBIR (Pang et al 2017, 2019) is an alternative sketch-based image retrieval task that allows searching for images of specific objects.
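To make the any-shot setting concrete, the following toy illustration (our own, not from the paper) shows how such a split can be built: test classes are disjoint from training classes in the zero-shot case, and contribute only k labelled examples each in the k-shot case.

```python
# Toy any-shot data split: k=0 gives zero-shot, k>0 gives k-shot.
import random

def any_shot_split(samples_by_class, unseen_classes, k=0):
    """samples_by_class: dict mapping class name -> list of samples.
    Unseen classes appear at test time; with k > 0, k samples per
    unseen class are moved into training as the few-shot support set."""
    train, test = [], []
    for cls, samples in samples_by_class.items():
        if cls in unseen_classes:
            support = random.sample(samples, k) if k else []
            train += [(s, cls) for s in support]             # few-shot support
            test += [(s, cls) for s in samples if s not in support]
        else:
            train += [(s, cls) for s in samples]             # seen classes
    return train, test
```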

