Abstract
The creation of digital surface models (DSMs) from aerial and satellite imagery is often the starting point for remote sensing applications. The two main approaches to this task are stereo matching and multi-view stereo (MVS). The former takes stereo-rectified image pairs as input and produces results in the disparity domain. The latter works with images from multiple viewpoints and produces results in the depth domain. Both approaches have proven successful in producing accurate DSMs, especially with learning-based methods. Nonetheless, comparing the two is difficult because they differ in their input data, in the domain of their direct outputs, and in their evaluation metrics. In this manuscript, we process synthetic and real optical data so that they are compatible with both stereo and MVS algorithms. These data are then used as input to learning-based algorithms of both types. We focus on an experimental setting that makes the comparison between the algorithms as fair as possible. In particular, we look at urban areas with high object density and sharp boundaries, which pose challenges such as occlusions and depth discontinuities. Overall, all experiments show good performance, with specific differences in the reconstructed objects. We describe the performance of the compared cases both qualitatively and quantitatively. Moreover, we consider an additional case in which the results are fused into a single DSM using confidence estimation, which yields a further improvement and opens up possibilities for further research.
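To illustrate why outputs in the disparity domain and the depth domain are not directly comparable, the following minimal sketch converts between the two using the textbook relation Z = f · B / d. It assumes an idealized rectified frame-camera (pinhole) pair, which is not necessarily the sensor model used in this work; satellite pushbroom sensors generally need a different geometric model. The function names and parameters (`focal_px`, `baseline_m`) are hypothetical and chosen only for illustration.

```python
import math


def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a disparity (pixels) to a depth (metres) via Z = f * B / d.

    Assumption: idealized rectified pinhole stereo pair. Non-positive
    disparities carry no valid depth and are returned as NaN.
    """
    if disparity_px <= 0:
        return math.nan
    return focal_px * baseline_m / disparity_px


def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Inverse mapping, d = f * B / Z, under the same assumption."""
    if depth_m <= 0:
        return math.nan
    return focal_px * baseline_m / depth_m
```

Because depth is inversely proportional to disparity, a fixed error in pixels corresponds to a depth error that grows with distance. This is one reason a fair comparison needs both outputs brought into a common domain, such as DSM heights, before metrics are computed.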