Multi-view Stereo Datasets Research Articles

Although multi-layer perceptron-based methods have achieved promising results on shape and color recovery through self-supervision, they usually incur a heavy computational cost when learning a deep implicit surface representation. Since rendering each pixel requires a forward network inference, synthesizing a whole image is computationally intensive. To tackle these challenges, we propose an effective coarse-to-fine approach to recover a textured mesh from multiple views. Specifically, a differentiable Poisson solver is employed to represent the object's shape, which is able to produce topology-agnostic and watertight surfaces. To account for depth information, we optimize the shape geometry by minimizing the differences between the rendered mesh and the depth predicted by multi-view stereo. In contrast to implicit neural representations of shape and color, we introduce a physically based inverse rendering scheme to jointly estimate the environment lighting and the object's reflectance, which is able to render high-resolution images in real time. The texture of the reconstructed mesh is interpolated from a learnable dense texture grid. We have conducted extensive experiments on several multi-view stereo datasets, and the promising results demonstrate the efficacy of the proposed approach. The code is available at https://github.com/l1346792580123/diff.
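
The depth term mentioned in the abstract above can be illustrated with a small sketch. The abstract does not state the exact form of the loss, so the masked L1 difference below, the function name `masked_depth_l1`, and the flat per-pixel lists are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def masked_depth_l1(
    rendered: Sequence[float],
    predicted: Sequence[float],
    valid: Sequence[bool],
) -> float:
    """Mean absolute depth difference over pixels marked valid.

    Hypothetical helper: `rendered` stands for depth rendered from the mesh,
    `predicted` for depth from multi-view stereo, `valid` for a per-pixel mask.
    """
    if not (len(rendered) == len(predicted) == len(valid)):
        raise ValueError("inputs must have the same length")
    # Keep only pixels where the mask says a depth comparison is meaningful.
    diffs = [abs(r - p) for r, p, v in zip(rendered, predicted, valid) if v]
    # With no valid pixels there is nothing to penalize.
    return sum(diffs) / len(diffs) if diffs else 0.0


if __name__ == "__main__":
    loss = masked_depth_l1([1.0, 2.0, 3.0], [1.5, 2.0, 10.0], [True, True, False])
    print(loss)
```

In an actual optimization this quantity would be computed with a differentiable framework so that gradients reach the mesh; the plain-Python version only shows the arithmetic.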

The confidence prediction task attempts to infer the correctness of estimated depth hypotheses; it has recently gained popularity in stereo matching, where it boosts the accuracy of disparity estimation. However, less attention has been paid to confidence prediction for multi-view stereo (MVS), where multi-view depth estimation is a key step for high-quality reconstruction. In this work, we propose a Geometry-consistent Confidence prediction Network (GeoConfNet), in which the correctness of a depth hypothesis is predicted by a deep neural network that exploits both spatial coherence and cross-view consistency. The proposed network consists of a feature extraction module, a U-Net-based fusion module, and a confidence refinement module. Furthermore, we demonstrate that the truncated signed distance field (TSDF) is a powerful cross-view feature that effectively complements spatial features, thereby remarkably boosting the confidence prediction accuracy of MVS. Exhaustive experiments on a variety of MVS datasets as well as stereo matching datasets clearly demonstrate that our method achieves significantly better performance than state-of-the-art methods in terms of area under the curve (AUC).
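
To make the AUC metric concrete, here is a minimal sketch of how confidence scores could be scored against ground-truth correctness. The abstract does not say which curve the AUC refers to; this example assumes a ROC-style AUC, computed as the fraction of (correct, incorrect) hypothesis pairs in which the correct one receives the higher confidence. The function name `roc_auc` and the toy inputs are illustrative assumptions.

```python
from typing import Sequence


def roc_auc(confidences: Sequence[float], is_correct: Sequence[bool]) -> float:
    """ROC AUC via pairwise ranking (ties count as half a win).

    Assumption: `is_correct[i]` says whether depth hypothesis i is within
    some error threshold of the ground truth; the threshold is not modeled here.
    """
    if len(confidences) != len(is_correct):
        raise ValueError("inputs must have the same length")
    pos = [c for c, ok in zip(confidences, is_correct) if ok]
    neg = [c for c, ok in zip(confidences, is_correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect hypothesis")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Perfect ranking: every correct hypothesis outranks every incorrect one.
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [True, True, False, False]))
```

The quadratic pairwise loop is chosen for readability; a sort-based computation would be preferable for full-resolution depth maps.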

Articles published on Multi-view Stereo Datasets

Multiview Textured Mesh Recovery by Differentiable Rendering

High accuracy and geometry-consistent confidence prediction network for multi-view stereo

Depth-map completion for large indoor scene reconstruction

A Point-Cloud-Based Multiview Stereo Algorithm for Free-Viewpoint Video
