Abstract

2D image-based 3D shape retrieval (2D-to-3D) aims to retrieve the corresponding unlabeled 3D shapes given a labeled 2D image, a fundamental task in computer vision that has attracted a surge of attention in recent years. However, most prior works are limited in two respects: 1) they reduce the domain discrepancy while ignoring 3D shape style, and 2) 3D shapes are pseudo-annotated directly by a classifier supervised only on 2D images, neglecting the structure information underlying the 3D shape domain. To remedy these issues, we propose a feature transformation framework with selective pseudo-labeling (FTSPL) for the 2D-to-3D task. Specifically, we first employ CNNs to produce features for both 2D images and 3D shapes (described as multiple views), and then enforce class-wise inter-domain centroid alignment to reduce the overall domain discrepancy. Beyond this, we exploit the intra-category attribute variation (covariance) of the 3D shape features to transform the 2D image features, thereby equipping the 2D features with 3D shape style. Since estimating the centroids and covariances of the 3D shape features requires accurate label predictions, we put forward a selective pseudo-labeling module that assigns reliable pseudo-labels to 3D shapes via the nearest category centroid and cluster analysis, respectively, while preserving the structure information of the 3D shape domain. Comprehensive experiments validate that our model surpasses the state of the art on standard 2D-to-3D benchmarks (MI3DOR and MI3DOR-2).
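To make the three ingredients of the abstract concrete, the sketch below illustrates, in PyTorch, one plausible reading of class-wise centroid alignment, covariance-based feature transformation, and nearest-centroid selective pseudo-labeling. It is a minimal illustration under our own assumptions, not the authors' implementation: all function names, the Gaussian-noise form of the style transformation, and the softmax-over-distance confidence threshold are hypothetical choices made here for clarity.

```python
import torch

def class_centroids(feats, labels, num_classes):
    # Mean feature per class; used both for inter-domain centroid
    # alignment and for nearest-centroid pseudo-labeling.
    return torch.stack([feats[labels == c].mean(dim=0) for c in range(num_classes)])

def centroid_alignment_loss(img_feats, img_labels, shape_feats, shape_pseudo, num_classes):
    # Class-wise centroid alignment: pull the per-class centroids of the
    # 2D-image and 3D-shape feature distributions together.
    c_img = class_centroids(img_feats, img_labels, num_classes)
    c_shape = class_centroids(shape_feats, shape_pseudo, num_classes)
    return ((c_img - c_shape) ** 2).sum(dim=1).mean()

def stylize_image_features(img_feats, img_labels, shape_feats, shape_pseudo,
                           num_classes, alpha=0.5):
    # Hypothetical style transfer: perturb each 2D feature with noise drawn
    # from the intra-class covariance of the 3D-shape features, so the 2D
    # features acquire the attribute variation ("style") of the 3D domain.
    out = img_feats.clone()
    dim = img_feats.shape[1]
    for c in range(num_classes):
        cls_shape = shape_feats[shape_pseudo == c]
        mask = img_labels == c
        if cls_shape.shape[0] < 2 or mask.sum() == 0:
            continue
        cov = torch.cov(cls_shape.T) + 1e-4 * torch.eye(dim)  # intra-class covariance
        noise = torch.distributions.MultivariateNormal(
            torch.zeros(dim), covariance_matrix=cov
        ).sample((int(mask.sum()),))
        out[mask] = img_feats[mask] + alpha * noise
    return out

def select_pseudo_labels(shape_feats, img_centroids, threshold=0.8):
    # Nearest-category-centroid assignment; keep only confident shapes so
    # that unreliable pseudo-labels do not corrupt centroid/covariance
    # estimation (cluster analysis would filter further in the full method).
    dists = torch.cdist(shape_feats, img_centroids)   # (N_shapes, num_classes)
    conf, label = torch.softmax(-dists, dim=1).max(dim=1)
    keep = conf > threshold
    return label, keep
```

In this reading, the alignment loss handles the global domain gap, the covariance-driven perturbation injects 3D style into the 2D features, and the selection step supplies the reliable pseudo-labels those two estimates depend on.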
