In this article, we present a deep learning approach to sketch-based shape retrieval that incorporates several novel techniques to improve the quality of the retrieval results. First, to address the scarcity of training sketch data, we present a sketch augmentation method that mimics human sketching more closely than simple image transformations do. Our method generates additional sketches from the existing training data by (i) removing a stroke, (ii) adjusting a stroke, and (iii) rotating the sketch. In this way, we generate a large number of sketch samples for training our neural network. Second, we obtain the 2D renderings of each 3D model in the shape database by determining the view positions that best depict the 3D shape, i.e., positions that avoid self-occlusion, show the most salient features, and match how a human would normally sketch the model. We use a convolutional neural network (CNN) to learn the best viewing positions for each 3D model and generate the corresponding 2D images for the next step. Third, our method uses a cross-domain learning strategy based on two Siamese CNNs that pair sketches with the 2D shape images. A joint Bayesian measure is applied to the outputs of these CNNs to maximize intra-class similarity and minimize inter-class similarity. Extensive experiments show that our proposed approach comprehensively outperforms many existing state-of-the-art methods.
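The sketch augmentation step can be illustrated with a minimal code sketch, assuming a sketch is represented as a list of strokes, each stroke being a list of (x, y) points. All function names and parameter values below are illustrative assumptions, not taken from the paper's implementation.

```python
import math
import random

def remove_stroke(sketch):
    """Drop one randomly chosen stroke, keeping at least one stroke."""
    if len(sketch) <= 1:
        return [list(s) for s in sketch]
    idx = random.randrange(len(sketch))
    return [list(s) for i, s in enumerate(sketch) if i != idx]

def adjust_stroke(sketch, max_shift=3.0):
    """Perturb one randomly chosen stroke by a small random translation."""
    idx = random.randrange(len(sketch))
    dx = random.uniform(-max_shift, max_shift)
    dy = random.uniform(-max_shift, max_shift)
    out = [list(s) for s in sketch]
    out[idx] = [(x + dx, y + dy) for x, y in out[idx]]
    return out

def rotate_sketch(sketch, max_degrees=15.0):
    """Rotate the whole sketch about its centroid by a small random angle."""
    pts = [p for s in sketch for p in s]
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    theta = math.radians(random.uniform(-max_degrees, max_degrees))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [[(cx + (x - cx) * cos_t - (y - cy) * sin_t,
              cy + (x - cx) * sin_t + (y - cy) * cos_t) for x, y in s]
            for s in sketch]

def augment(sketch, n_samples=10):
    """Generate n_samples new sketches by randomly applying one of the three operations."""
    ops = [remove_stroke, adjust_stroke, rotate_sketch]
    return [random.choice(ops)(sketch) for _ in range(n_samples)]
```

In practice, the three operations could also be composed per sample rather than applied one at a time; the choice above is only one plausible way to expand a small set of human-drawn sketches into a larger training set.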