Abstract

Geometric and semantic contexts are essential to solving the ill-posed problem of monocular depth estimation (MDE). In this paper, we propose a deep framework that aggregates dual-modal structural contexts for monocular depth estimation (DSC-MDE). First, a cross-shaped context (CSC) aggregation module is developed to globally encode the geometric structures in depth maps observed within the fields of view of robots and autonomous vehicles. Next, the CSC-encoded geometric features are further modulated with semantic context in an object-regional context (ORC) aggregation module. Finally, to train the proposed network, we present a focal ordinal loss (FOL), which pays more attention to distant samples and thereby avoids the over-relaxed constraints that the ordinal regression loss (ORL) imposes on them. We compare the proposed model against recent methods that exploit geometric and multi-modal contexts, and show that it achieves state-of-the-art performance on both indoor and outdoor datasets, including NYU-Depth-V2, Cityscapes, and KITTI.
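To make the loss concrete, below is a minimal sketch of how a focal ordinal loss of this kind might look, assuming a DORN-style per-threshold ordinal formulation. The function name focal_ordinal_loss, the focal exponent gamma, and the exact weighting scheme are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn.functional as F

def focal_ordinal_loss(logits, gt_bins, gamma=2.0):
    # logits:  (B, K, H, W); sigmoid(logits[:, k]) approximates P(depth > t_k)
    #          under the assumed DORN-style per-threshold formulation.
    # gt_bins: (B, H, W) integer ground-truth depth-bin index per pixel.
    # gamma:   focal exponent; larger values up-weight poorly fitted thresholds.
    K = logits.size(1)
    ks = torch.arange(K, device=logits.device).view(1, K, 1, 1)
    # Ordinal targets: threshold k is "on" iff the true bin lies beyond it.
    targets = (gt_bins.unsqueeze(1) > ks).float()
    # Probability the model assigns to the correct side of each threshold.
    p = torch.sigmoid(logits)
    p_correct = torch.where(targets > 0.5, p, 1.0 - p)
    # Focal modulation: thresholds that remain badly fitted (typically the many
    # thresholds crossed by distant pixels) dominate the loss instead of being
    # averaged away, as happens in the plain ordinal regression loss.
    focal_weight = (1.0 - p_correct).pow(gamma)
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (focal_weight * bce).mean()

With gamma set to zero this reduces to the standard ORL; the focal factor concentrates the gradient on under-fitted samples, which is the reweighting idea the abstract attributes to FOL, though the paper's actual formulation may differ.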
