Abstract

Extracting detailed geometric information about a scene relies on the quality of the depth maps (e.g. digital surface models, DSMs) to enhance the performance of 3D model reconstruction. Elevation information from LiDAR is often expensive and hard to obtain. The most common approach to generating depth maps is through multi-view stereo (MVS) methods (e.g. dense stereo image matching). The quality of a single depth map, however, is often degraded by noise, outliers, and missing data points caused by the quality of the acquired image pairs: a reference multi-view image pair must be noise-free and clear to ensure a high-quality depth map. To avoid this problem, current research is moving toward fusing multiple depth maps to overcome the shortcomings of a single depth map produced from a single pair of multi-view images. Several approaches tackle this problem by merging and fusing depth maps using probabilistic and deterministic methods, but few discuss how the fused depth maps can be refined through adaptive spatiotemporal analysis algorithms (e.g. spatiotemporal filters). The motivation is to preserve the high precision and level of detail of depth maps while optimizing the performance, robustness, and efficiency of the algorithm.
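To make the fusion idea concrete, the following is a minimal sketch (not the paper's method) of merging several per-pair depth maps with a per-pixel median, which fills holes present in individual maps and suppresses outlier heights. The function name, the NaN convention for missing data, and the `min_views` threshold are illustrative assumptions.

```python
import numpy as np

def fuse_depth_maps(depth_maps, min_views=2):
    """Fuse per-pair depth maps into one map via a per-pixel median.

    depth_maps : list of 2D arrays, with np.nan marking missing heights.
    min_views  : pixels seen by fewer maps than this stay missing.
    """
    stack = np.stack(depth_maps)              # shape (n_maps, H, W)
    valid = np.sum(~np.isnan(stack), axis=0)  # number of observations per pixel
    fused = np.nanmedian(stack, axis=0)       # median ignores NaNs, resists outliers
    fused[valid < min_views] = np.nan         # mark unreliable pixels as missing
    return fused

# Three toy 2x2 depth maps: one outlier height and two holes.
d1 = np.array([[10.0, 12.0], [np.nan, 11.0]])
d2 = np.array([[10.2, 50.0], [13.0, 11.1]])   # 50.0 is an outlier
d3 = np.array([[ 9.8, 12.1], [13.2, np.nan]])
fused = fuse_depth_maps([d1, d2, d3])
```

A median is a deterministic fusion rule in the sense used above; a probabilistic variant would instead weight each observation by a per-pixel confidence or noise model.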

Highlights

  • Over the last few decades, a large number of Very High Resolution (VHR) satellites have been launched to provide sub-meter resolution imagery, with frequent revisit times during the year that allow extracting comprehensive 3D geometric information about a scene

  • Images captured by satellite sensors are prone to spectral inconsistencies and distortions, which may affect the dense stereo matching algorithm, produce incorrect or missing height information, and degrade the quality of the depth map

  • Because multi-view stereo (MVS) algorithms are very sensitive to temporal inconsistencies between the images, they cannot be used directly to obtain 3D models and generate height information such as the digital surface model (DSM)

Introduction

Over the last few decades, a large number of Very High Resolution (VHR) satellites have been launched to provide sub-meter resolution imagery, with frequent revisit times during the year that allow extracting comprehensive 3D geometric information about a scene. Algorithms such as multi-view stereo (MVS) depend heavily on the spatial and temporal resolutions of the sensors to generate reliable 3D reconstructed models. Acquisition conditions and measurement errors, such as the distance to the sensor, lighting conditions, occlusions in the scene (e.g. a tree obstructing a building), and insufficient overlap between the images, can complicate unique feature matching (Qin, 2019). Object properties and their patterns in the scene can also increase the uncertainty of the generated heights: thin structures, texture-less surfaces, featureless areas, and repeated patterns or structures directly affect stereo matching. All of these errors can make the height data temporally inconsistent and lead to holes, noise, missing data, blurry artifacts, fuzzy edges and boundaries, and an incomplete and unreliable representation of the 3D information.

