Three-dimensional (3D) object reconstruction has a wide range of applications, including augmented reality, virtual reality, industrial manufacturing and intelligent robotics. Although deep learning-based 3D object reconstruction has developed rapidly in recent years, important problems remain to be solved. One of them is that the resolution of reconstructed 3D models is hard to improve because of memory and computational constraints when models are deployed on resource-limited devices. In this paper, we propose 3D-RVP to reconstruct a complete and accurate 3D geometry from a single depth view, where R, V and P stand for Reconstruction, Voxel and Point, respectively. It is a novel two-stage method that combines a 3D encoder-decoder network with a point prediction network. In the first stage, a 3D encoder-decoder network with residual learning outputs a coarse prediction. In the second stage, an iterative subdivision algorithm predicts the labels of adaptively selected points. The proposed method can output high-resolution 3D models while adding only a small number of parameters. Experiments are conducted on the widely used ShapeNet benchmark, from which four categories of models are selected to test network performance. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving an improvement of about 2.7% in terms of the intersection-over-union metric.
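For readers unfamiliar with the intersection-over-union (IoU) metric mentioned above, the following is a minimal sketch of how it is commonly computed for voxel occupancy. It is not the evaluation code of this paper: the function name `voxel_iou`, the representation of a shape as a set of occupied `(x, y, z)` coordinates, and the convention of returning 1.0 for two empty shapes are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: a shape is the set of its occupied voxel coordinates.
Voxel = Tuple[int, int, int]


def voxel_iou(predicted: Iterable[Voxel], ground_truth: Iterable[Voxel]) -> float:
    """Return |P ∩ G| / |P ∪ G| for two sets of occupied voxels."""
    pred: Set[Voxel] = set(predicted)
    truth: Set[Voxel] = set(ground_truth)
    union = pred | truth
    if not union:
        # Both shapes are empty; treating this as perfect agreement is an
        # assumed convention, not something stated in the abstract.
        return 1.0
    return len(pred & truth) / len(union)


# Toy example: a 2x2x1 slab compared with the same slab shifted one voxel along x.
pred = {(x, y, 0) for x in range(0, 2) for y in range(2)}
truth = {(x, y, 0) for x in range(1, 3) for y in range(2)}
score = voxel_iou(pred, truth)  # 2 shared voxels out of 6 in the union
```

In this toy case the two slabs share two voxels and together cover six, so the score is one third. A dense occupancy grid could be scored the same way after thresholding predicted probabilities into occupied and empty voxels.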