Abstract

With the increased availability of multi-view satellite images, research on 3D urban scene reconstruction from such images is also growing. Conventional Multi-View Stereo (MVS) pipelines require the calibrated pose information of the satellite cameras to determine the epipolar geometry and the 3D structure of the stereo correspondences. In this study, we propose a novel Monocular Height estimation and Fusion (MHF) method for 3D reconstruction from uncalibrated multi-view satellite images. First, a learned monocular depth network estimates a height map for each satellite image. Second, the height maps obtained from all views are fused into a refined height map in each image plane. To fuse the height maps, all maps are affine-transformed to a virtual reference coordinate system, and the transformed maps are then projected onto the image plane of each camera coordinate system. The monocular depth network was trained and evaluated on the Data Fusion Contest 2019 (DFC19) dataset, which includes Jacksonville, FL, and Omaha, NE. We also evaluate the method on the ATL-SN4 dataset, which covers Atlanta, GA, to test it on new urban scenes not seen during training.
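The fusion step described above (warp each per-view height map into a shared reference frame, combine, then project back to each view) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the nearest-neighbor warping, the per-cell median as the fusion rule, and the assumption that each view's 2x3 affine matrix is already known are all hypothetical choices made for illustration. The abstract does not state how the affine transforms are estimated or how overlapping heights are combined.

```python
from statistics import median


def apply_affine(A, x, y):
    """Map pixel (x, y) through a 2x3 affine matrix A = [[a, b, tx], [c, d, ty]]."""
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])


def fuse_in_reference(height_maps, affines, ref_shape):
    """Forward-warp every view into the reference grid and fuse per cell.

    Assumption: the median is used as the fusion rule; cells that no view
    covers are left as None.
    """
    rows, cols = ref_shape
    samples = [[[] for _ in range(cols)] for _ in range(rows)]
    for hmap, A in zip(height_maps, affines):
        for y, row in enumerate(hmap):
            for x, h in enumerate(row):
                if h is None:  # skip pixels without a height estimate
                    continue
                u, v = apply_affine(A, x, y)
                ui, vi = round(u), round(v)  # nearest reference cell
                if 0 <= vi < rows and 0 <= ui < cols:
                    samples[vi][ui].append(h)
    return [[median(s) if s else None for s in row] for row in samples]


def project_to_view(fused, A, view_shape):
    """Sample the fused reference map back into one view's image plane."""
    rows, cols = view_shape
    out = [[None] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            u, v = apply_affine(A, x, y)
            ui, vi = round(u), round(v)
            if 0 <= vi < len(fused) and 0 <= ui < len(fused[0]):
                out[y][x] = fused[vi][ui]
    return out


# Toy example: two 2x2 height maps; view B is shifted one pixel in x.
view_a = [[10.0, 12.0], [11.0, 13.0]]
view_b = [[12.5, 20.0], [13.5, 21.0]]
A_a = [[1, 0, 0], [0, 1, 0]]  # identity
A_b = [[1, 0, 1], [0, 1, 0]]  # translation by +1 in x

fused = fuse_in_reference([view_a, view_b], [A_a, A_b], ref_shape=(2, 3))
refined_a = project_to_view(fused, A_a, view_shape=(2, 2))
refined_b = project_to_view(fused, A_b, view_shape=(2, 2))
```

In the toy example the two views overlap in one column of the reference grid, so only those cells change after fusion; pixels seen by a single view keep their original height. A practical version would also need to reconcile height scale and offset between views and to interpolate rather than round, which this sketch deliberately omits.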
