Abstract

Learning-based multi-view stereo (MVS) is gaining prominence as a method for 3D reconstruction. However, during feature learning, existing methods fail to exploit the structural information implicit in the scene. This oversight prevents the network from perceiving the geometric properties of the scene and weakens its generalizability. We therefore propose a novel framework, the Point Feature Relation Network for Multi-view Stereo (PFR-MVSNet), composed of a Dynamic Structure Perception (DSP) module, an Adaptive Feature Exploration (AFE) module, and a Point Transformer Block (PTB), to address these problems. The DSP module first augments the features of the 3D point cloud with multi-view 2D features, then establishes spatial structure relations within local regions of the point cloud and guides point feature learning with the aggregated structural information. Once the network has fully learned the intra-region structure features, the AFE module repartitions the perception regions by feature similarity, and the PTB module further refines the point features within these regions. We evaluate our method on three benchmark datasets: DTU, Tanks & Temples, and ETH3D. The experimental results show that our method achieves a superior accuracy of 0.289 mm on the DTU dataset and generalizes more robustly to the Tanks & Temples and ETH3D datasets than other learning-based MVS methods.
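The core idea behind the DSP module, establishing spatial structure relations within local regions of a point cloud, can be illustrated with a minimal sketch. The code below is not the paper's implementation; it is a simplified, non-learned stand-in (k-nearest-neighbor grouping plus relative-position encoding and mean pooling, where a real DSP module would use learned aggregation), with all function names and shapes chosen for illustration only.

```python
import numpy as np

def knn_indices(points, k):
    """Return indices of the k nearest neighbors of each point (including itself)."""
    # Pairwise squared distances between all points: shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(-1)
    return np.argsort(dist2, axis=1)[:, :k]

def aggregate_local_structure(points, feats, k=8):
    """For each point, encode relative positions to its k neighbors and
    fuse them with the neighbors' features by mean pooling.

    points: (N, 3) point coordinates; feats: (N, C) per-point features
    (e.g., lifted from multi-view 2D feature maps). Returns (N, 3 + C).
    """
    idx = knn_indices(points, k)                  # (N, k) neighbor indices
    neigh_xyz = points[idx]                       # (N, k, 3)
    rel_pos = neigh_xyz - points[:, None, :]      # relative coordinates (geometry)
    neigh_feat = feats[idx]                       # (N, k, C)
    # Concatenate the geometric relation with neighbor features, then pool
    # over the local region to obtain a structure-aware descriptor.
    local = np.concatenate([rel_pos, neigh_feat], axis=-1)  # (N, k, 3 + C)
    return local.mean(axis=1)                     # (N, 3 + C)
```

In this reading, the pooled descriptor summarizes the geometry of each point's neighborhood and could then guide further per-point feature learning, analogous to how the DSP module conditions feature learning on aggregated structural information.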
