Abstract

Reconstructing a 3D model from a single image is challenging. Nevertheless, recent advances in deep learning have demonstrated exciting progress toward single-view 3D object reconstruction. Successful training of a deep learning model, however, requires an extensive dataset of geometrically aligned pairs of 3D models and color images. While manual dataset collection using photogrammetry or laser scanning is laborious, 3D modeling offers a promising alternative for data generation. Still, a deep model must be able to generalize from synthetic to real data. In this paper, we evaluate the impact of synthetic data in the training dataset on the performance of the trained model. We use the recently proposed Z-GAN model as the starting point for our research. Z-GAN leverages generative adversarial training and a frustum voxel model to achieve state-of-the-art results in single-view voxel model prediction. We generated a new dataset with 2k synthetic color images and corresponding voxel models. We trained the Z-GAN model on synthetic, real, and mixed images and compared the performance of the trained models on real and synthetic images. We provide a qualitative and quantitative evaluation in terms of the Intersection over Union (IoU) between the ground-truth and predicted voxel models. The evaluation demonstrates that a model trained only on synthetic data fails to generalize to real color images; nevertheless, combining synthetic and real data improves the performance of the trained model. We made our training dataset publicly available (http://www.zefirus.org/SyntheticVoxels).
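The evaluation metric mentioned above, Intersection over Union between voxel grids, can be sketched as follows. This is a minimal illustrative implementation, not the paper's own code; the binarization threshold of 0.5 and the grid shapes are assumptions.

```python
import numpy as np

def voxel_iou(pred: np.ndarray, gt: np.ndarray, threshold: float = 0.5) -> float:
    """IoU between two voxel grids.

    `pred` may hold occupancy probabilities; both grids are binarized
    at `threshold` (an assumed value) before comparison.
    """
    p = pred > threshold
    g = gt > threshold
    intersection = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    # Two empty grids are considered a perfect match.
    return float(intersection) / float(union) if union > 0 else 1.0

# Toy example: 2x2x2 grids sharing one occupied voxel.
pred = np.zeros((2, 2, 2))
pred[0, 0, 0] = 1.0
pred[0, 0, 1] = 1.0
gt = np.zeros((2, 2, 2))
gt[0, 0, 0] = 1.0
print(voxel_iou(pred, gt))  # 1 shared voxel / 2 occupied overall = 0.5
```

In practice the predicted and ground-truth grids would come from the network output and the dataset's voxel models, respectively.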

Full Text
Published version (Free)