Abstract
Monocular depth estimation, one of the fundamental tasks of computer vision, plays a crucial role in three-dimensional (3D) scene understanding and perception. Deep learning methods usually recover monocular depth maps in a continuous regression manner, minimizing the error between the ground-truth and predicted depth. However, fine depth features may not be fully captured through layer-by-layer encoding, which tends to yield depth maps with low spatial resolution and insufficient detail. Furthermore, continuous regression usually converges slowly and produces unsatisfactory results. To tackle these issues, we propose a novel model, named the context-based ordinal regression network (CORNet), which reconstructs monocular depth maps in an ordinal regression manner using context information. First, we put forward a novel context-based encoder with a feature transformation (FT) module that learns context information and details from the input and outputs multi-scale feature maps. Then, we design a boundary enhancement module (BEM) with a spatial attention mechanism following each feature-fusion operation, which captures boundary features in the scene to enhance depth at object borders. Finally, a feature optimization module (FOM) fuses and optimizes the multi-scale features and boundary features to strengthen depth learning. Moreover, we introduce an ordinal weighted inference that predicts depth maps from probabilities and discretization values. Experiments on two challenging datasets, KITTI and NYU Depth V2, demonstrate that the proposed CORNet estimates monocular depth maps effectively and outperforms existing methods in capturing geometric features.
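For concreteness, the sketch below illustrates the general idea of ordinal weighted inference as described in the abstract: the network outputs per-pixel probabilities over K discrete depth bins, and the final depth is the probability-weighted sum of the bins' discretization values rather than the arg-max bin. The spacing-increasing discretization (SID) in log-depth used here is a common choice in ordinal depth estimation and is an assumption; the paper's exact discretization, bin count, and decoding details may differ.

```python
import math
import torch

def sid_bin_centers(k: int, d_min: float, d_max: float) -> torch.Tensor:
    """K bin centers spaced uniformly in log-depth (SID discretization)."""
    edges = torch.exp(torch.linspace(math.log(d_min), math.log(d_max), k + 1))
    return 0.5 * (edges[:-1] + edges[1:])

def ordinal_weighted_depth(logits: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """Soft decoding: logits of shape (B, K, H, W) -> depth of shape (B, H, W).

    Each bin's discretization value is weighted by its softmax probability,
    giving a continuous, sub-bin-accurate depth instead of a hard arg-max bin.
    """
    probs = torch.softmax(logits, dim=1)                   # (B, K, H, W)
    return (probs * centers.view(1, -1, 1, 1)).sum(dim=1)  # (B, H, W)

# Toy usage: random logits over 80 bins for a KITTI-style 1-80 m depth range.
centers = sid_bin_centers(k=80, d_min=1.0, d_max=80.0)
logits = torch.randn(2, 80, 48, 160)
depth = ordinal_weighted_depth(logits, centers)
print(depth.shape)  # torch.Size([2, 48, 160])
```

Similarly, the abstract states that the BEM applies a spatial attention mechanism after each feature fusion but does not specify its form. The following is a minimal CBAM-style spatial attention gate, given purely as an assumed stand-in for that component:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Hypothetical spatial attention gate (CBAM-style), not the paper's BEM."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # Two pooled maps (average and max over channels) -> one attention map.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)      # channel-average map
        mx, _ = x.max(dim=1, keepdim=True)     # channel-max map
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                        # reweight features spatially
```

Because the attention map is computed per spatial location, it can emphasize high-contrast regions such as object boundaries, which matches the stated purpose of the BEM.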