Abstract

Learning single-image 3D reconstruction with only 2D image supervision is a promising research topic. The main challenge in image-supervised 3D reconstruction is the shape-pose ambiguity: a 2D observation can be explained by an erroneous 3D shape viewed from an erroneous pose. This ambiguity introduces high uncertainty and misleads the learning process. Existing works rely on multi-view images or pose-aware annotations to resolve the ambiguity. In this paper, we propose to resolve the ambiguity without extra pose-aware labels or annotations; our training data consists of single-view images from the same object category. To overcome the shape-pose ambiguity, we introduce a pose-independent GAN that learns the category-specific shape manifold from the image collection. With the learned shape space, we resolve the shape-pose ambiguity in the original images by training a pseudo pose regressor. Finally, we learn a reconstruction network with both the common re-projection loss and a pose-independent discrimination loss, making the results plausible from all views. Through experiments on synthetic and real image datasets, we demonstrate that our method performs comparably to existing methods while not requiring any extra pose-aware annotations, making it more applicable and adaptable.
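
To make the combined training objective concrete, the sketch below illustrates, in PyTorch-style code, how a re-projection loss and a pose-independent discrimination loss might be combined during reconstruction-network training. This is a minimal sketch under stated assumptions, not the authors' implementation: all module and helper names (reconstructor, pose_regressor, renderer, discriminator, sample_random_pose) and the weight lambda_adv are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def training_step(image, silhouette, reconstructor, pose_regressor,
                  renderer, discriminator, lambda_adv=0.1):
    """One hypothetical optimization step for the reconstruction network."""
    shape = reconstructor(image)        # predict a 3D shape from a single image
    pose = pose_regressor(image)        # pseudo pose from the pretrained regressor

    # Re-projection loss: the shape rendered from the estimated pose
    # should match the observed 2D silhouette.
    rendered = renderer(shape, pose)
    loss_reproj = F.binary_cross_entropy(rendered, silhouette)

    # Pose-independent discrimination loss: renderings from randomly sampled
    # poses should look plausible to a discriminator trained on the learned
    # category-specific shape manifold.
    random_pose = sample_random_pose(image.shape[0])  # hypothetical pose sampler
    rendered_rand = renderer(shape, random_pose)
    loss_adv = -discriminator(rendered_rand).mean()   # generator-side adversarial term

    return loss_reproj + lambda_adv * loss_adv
```

In this reading, the re-projection term anchors the prediction to the single available view, while the adversarial term penalizes shapes that only look correct from that view, which is the stated mechanism for resolving the shape-pose ambiguity.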
