An important objective in computer vision is to analyze multiple images and reconstruct the underlying 3D shape and structure. Traditional multi-view 3D reconstruction techniques extract and match key features across images with known camera parameters. However, this approach is inefficient and fails to fully exploit multi-view information. Advances in deep learning have transformed multi-view 3D reconstruction by enabling end-to-end 3D shape inference without the sequential feature matching typical of conventional algorithms. Rapid recent progress in this field calls for a thorough review of current algorithms and for insight into methods of improving 3D reconstruction performance. This review classifies reconstruction algorithms according to the representation of their output model: depth map, voxel, point cloud, mesh, or implicit surface. It also covers frequently used network training loss functions, evaluation metrics, and 3D datasets. Experimental results are presented to assess the performance of different algorithms. Finally, the paper concludes with a summary, a discussion of challenges, and potential future directions.