Abstract

We study the problem of fine-grained sketch-based image retrieval. By performing instance-level (rather than category-level) retrieval, it embodies a timely and practical application, particularly with the ubiquitous availability of touchscreens. Three factors contribute to the challenging nature of the problem: 1) free-hand sketches are inherently abstract and iconic, making visual comparisons with photos difficult; 2) sketches and photos are in two different visual domains, i.e., black and white lines versus color pixels; and 3) fine-grained distinctions are especially challenging when executed across domain and abstraction-level. To address these challenges, we propose to bridge the image-sketch gap both at the high level via parts and attributes, as well as at the low level via introducing a new domain alignment method. More specifically, first, we contribute a data set with 304 photos and 912 sketches, where each sketch and image is annotated with its semantic parts and associated part-level attributes. With the help of this data set, second, we investigate how strongly supervised deformable part-based models can be learned that subsequently enable automatic detection of part-level attributes, and provide pose-aligned sketch-image comparisons. To reduce the sketch-image gap when comparing low-level features, third, we also propose a novel method for instance-level domain-alignment that exploits both subspace and instance-level cues to better align the domains. Finally, fourth, these are combined in a matching framework integrating aligned low-level features, mid-level geometric structure, and high-level semantic attributes. Extensive experiments conducted on our new data set demonstrate effectiveness of the proposed method.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call