Although deep network-based 3D reconstruction methods can recover 3D geometry from few inputs, they may produce unfaithful reconstructions of the occluded parts of 3D objects. To address this issue, we propose the Detail-Enhanced Generative Adversarial Network (DEGAN), which consists of an Encoder-Decoder-Based Generator (EDGen) and a Voxel-Point Embedding Network-Based Discriminator (VPDis), for 3D reconstruction from a monocular depth image of an object. First, EDGen decodes the features from the 2.5D voxel-grid representation of the input depth image and generates a 3D occupancy grid under GAN losses and a sampling point loss; the sampling point loss improves the accuracy of predicted points with high uncertainty. VPDis then helps reconstruct fine details under voxel-wise and point-wise adversarial losses. Experimental results show that DEGAN not only outperforms several state-of-the-art methods on the public ModelNet and ShapeNet datasets but also predicts occluded/missing parts of 3D objects more reliably.
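As a rough illustration of how the described losses might combine, the PyTorch-style sketch below pairs a small encoder-decoder generator with a sampling point loss that targets the most uncertain occupancy predictions (those closest to 0.5). The layer configuration, the top-k uncertainty selection, the WGAN-style adversarial terms, and the names `sampling_point_loss`, `d_voxel`, `d_point`, and `lam` are all assumptions for illustration; the abstract does not specify the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EDGen(nn.Module):
    """Encoder-decoder generator: 2.5D voxel grid -> 3D occupancy grid.
    Layer sizes are illustrative, not from the paper."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, v25d):  # v25d: (B, 1, D, H, W) 2.5D voxel grid
        return self.decoder(self.encoder(v25d))  # occupancy in [0, 1]

def sampling_point_loss(pred, target, k=1024):
    """Assumed form of the sampling point loss: penalize the k voxels whose
    predicted occupancy is most uncertain, i.e. closest to 0.5."""
    uncertainty = -(pred - 0.5).abs().flatten(1)   # high near p = 0.5
    idx = uncertainty.topk(k, dim=1).indices       # k most uncertain voxels
    p = pred.flatten(1).gather(1, idx)
    t = target.flatten(1).gather(1, idx)
    return F.binary_cross_entropy(p, t)

def generator_loss(pred, target, d_voxel, d_point, lam=10.0):
    """Generator objective: voxel and point adversarial terms (standing in
    for the two VPDis branches) plus the weighted sampling point loss."""
    adv = -d_voxel(pred).mean() - d_point(pred).mean()
    return adv + lam * sampling_point_loss(pred, target)
```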