Abstract
State-of-the-art methods for sketch classification and retrieval rely on deep convolutional neural networks to learn representations. Although deep neural networks can model images with hierarchical representations through convolution kernels, they cannot automatically extract the structural representations of object categories in a human-perceptible way. Furthermore, sketch images usually have large scale visual variations caused by drawing styles or viewpoints, which makes it difficult to develop generalized representations using the fixed computational mode of convolutional kernels. In this paper, we aim to address the problem of the fixed computational mode in the feature extraction process without extra supervision. We propose a novel architecture that dynamically discovers object landmarks and learns discriminative structural representations. Our model is composed of two components: a representative landmark discovering module that localizes the key points on the object, and a category-aware representation learning module that develops category-specific features. Specifically, we develop a structure-aware offset layer to dynamically localize the representative landmarks, which is optimized based on the category labels without extra supervision. After that, a diversity branch is introduced to extract the global discriminative features for each category. Finally, we employ a multi-task loss function to develop an end-to-end trainable architecture. At test time, we fuse the predictions obtained with different numbers of landmarks to produce the final results. Through extensive experiments, we compare our model with several state-of-the-art methods on two challenging datasets, TU-Berlin and Sketchy, for sketch classification and retrieval, and the experimental results demonstrate the effectiveness of our proposed model.