3D point cloud reconstruction is an important task in computer vision for environment perception. However, reconstructed scenes are often inaccurate and incomplete because existing methods do not take the visibility of pixels into account. In this paper, a cascaded network with a multiple cost volume aggregation module, named ADR-MVSNet, is proposed. ADR-MVSNet introduces three improvements. First, to improve reconstruction accuracy and reduce time complexity, an adaptive depth reduction module is proposed, which adaptively adjusts each pixel's depth range through a confidence interval. Second, to estimate the depth of occluded pixels in multiview images more accurately, a multiple cost volume aggregation module is proposed, in which the Gini impurity is introduced to measure the confidence of pixel depth predictions. Third, a multiscale photometric consistency filtering module is proposed, which considers the information in multiple confidence maps simultaneously and accurately filters out low-confidence pixels as outliers, thereby improving the accuracy of the reconstructed point cloud. Experimental results on the DTU and Tanks and Temples datasets demonstrate that ADR-MVSNet achieves highly accurate and highly complete reconstruction compared with state-of-the-art benchmarks.
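As a rough, non-authoritative illustration of the two ideas named above (not the paper's exact formulation), the sketch below shows how a Gini-impurity-based confidence and a confidence-interval-based depth range could be derived from a per-pixel depth probability volume. The function names, the `coverage` parameter, and the interval construction are assumptions made for illustration only.

```python
import numpy as np

def gini_confidence(prob_volume):
    """Per-pixel confidence from a depth probability volume of shape (D, H, W).

    Gini impurity G = 1 - sum_i p_i^2 is low when the distribution over depth
    hypotheses is peaked, so confidence can be taken as 1 - G = sum_i p_i^2.
    (Illustrative definition, not necessarily the paper's exact one.)
    """
    return np.sum(prob_volume ** 2, axis=0)

def adaptive_depth_range(prob_volume, depth_values, coverage=0.95):
    """Narrow the per-pixel depth search range to a confidence interval.

    Keeps the smallest set of depth hypotheses (sorted by probability) whose
    cumulative probability reaches `coverage`, and returns the min/max depth
    of that set as the next stage's per-pixel search range. Hypothetical
    interval rule chosen for illustration.
    """
    D = prob_volume.shape[0]
    order = np.argsort(-prob_volume, axis=0)                    # most probable first
    sorted_p = np.take_along_axis(prob_volume, order, axis=0)
    cum = np.cumsum(sorted_p, axis=0)
    k = np.argmax(cum >= coverage, axis=0) + 1                  # hypotheses kept per pixel
    keep = np.arange(D)[:, None, None] < k[None]                # (D, H, W) mask
    depths = np.broadcast_to(depth_values[:, None, None], prob_volume.shape)
    sorted_d = np.take_along_axis(depths, order, axis=0)        # depths in probability order
    d_min = np.where(keep, sorted_d, np.inf).min(axis=0)
    d_max = np.where(keep, sorted_d, -np.inf).max(axis=0)
    return d_min, d_max
```

In such a scheme, pixels with a peaked probability distribution receive a narrow depth range in the next cascade stage (reducing computation), while ambiguous or occluded pixels keep a wider range; the Gini-based confidence map could likewise feed the aggregation and filtering steps described in the abstract.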