Sketch-based 3D shape retrieval (SBSR) can be approached by learning domain-invariant descriptors or ranking metrics from sketches and 2D view images of 3D shapes rendered from numerous viewpoints. However, determining the viewpoints that convey the most discriminative geometric features for SBSR remains an essential yet under-explored problem. Existing works extract 3D features from multi-view images rendered at pre-defined viewpoints to match 2D sketches; these methods, however, cannot select viewpoints dynamically according to the SBSR task. In this work, we introduce a fully differentiable viewpoint learning paradigm driven by the downstream SBSR task, which enables task-aware, sketch-dependent dynamic viewpoint selection. We integrate this task-specific and sketch-dependent viewpoint learning process into a meta-learning framework to develop a novel Dynamic Viewer (DV) module for SBSR. The DV module comprises a Meta View Learner (MVL) block and a View Generator (VG) block. Specifically, as the first part of the DV module, the MVL block learns to initialize the network parameters of the VG block; the VG block, serving as the second part, then learns the optimal viewpoints from which to render 2D images. To learn the optimal viewpoints for SBSR, we further introduce a view mining loss that maximizes the feature-level similarity between the rendered 2D views and the query sketch. In addition, we adopt a variational autoencoder (VAE) that takes the newly rendered images and the query sketch as inputs to retrieve 3D shapes. Comprehensive experiments on popular SBSR datasets demonstrate that the proposed framework outperforms recent methods on both category-level and fine-grained SBSR.
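The abstract does not specify implementation details, so the following minimal PyTorch-style sketch only illustrates one plausible reading of the MVL/VG interaction and the view mining loss described above. The layer shapes, the (azimuth, elevation) angle parameterization, the cosine-similarity form of the loss, and all helper names (`sketch_encoder`, `view_encoder`, `differentiable_render`) are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaViewLearner(nn.Module):
    """MVL block (assumed form): predicts the parameters of the VG block
    from the query-sketch feature, making viewpoint selection sketch-dependent."""
    def __init__(self, feat_dim=512, num_views=4):
        super().__init__()
        self.num_views = num_views
        self.weight_head = nn.Linear(feat_dim, feat_dim * num_views * 2)  # 2 angles per view
        self.bias_head = nn.Linear(feat_dim, num_views * 2)

    def forward(self, sketch_feat):
        # Produce the weight matrix and bias vector that the VG block will use.
        w = self.weight_head(sketch_feat).view(self.num_views * 2, -1)
        b = self.bias_head(sketch_feat).view(self.num_views * 2)
        return w, b

class ViewGenerator(nn.Module):
    """VG block (assumed form): applies the MVL-predicted parameters to map the
    sketch feature to num_views (azimuth, elevation) pairs in [-pi, pi]."""
    def forward(self, sketch_feat, w, b):
        angles = torch.tanh(F.linear(sketch_feat, w, b)) * torch.pi
        return angles.view(-1, 2)  # (num_views, 2)

def view_mining_loss(view_feats, sketch_feat):
    """Assumed form of the view mining loss: pull features of the rendered 2D
    views towards the query-sketch feature via cosine similarity."""
    sim = F.cosine_similarity(view_feats, sketch_feat.expand_as(view_feats), dim=-1)
    return (1.0 - sim).mean()

# Usage sketch for a single query (encoders and renderer are placeholders):
# sketch_feat = sketch_encoder(sketch)              # (1, 512)
# w, b = mvl(sketch_feat.squeeze(0))                # MVL initializes VG parameters
# viewpoints = vg(sketch_feat.squeeze(0), w, b)     # (num_views, 2) angles
# views = differentiable_render(shape, viewpoints)  # rendered 2D images
# view_feats = view_encoder(views)                  # (num_views, 512)
# loss = view_mining_loss(view_feats, sketch_feat)
```

Because the MVL emits the VG parameters rather than copying them into fixed weights, the whole viewpoint-selection path stays differentiable, which matches the fully differentiable paradigm claimed in the abstract.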