Abstract

Background: Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features without suppressing depth-irrelevant information, which interferes with depth estimation and leads to inferior accuracy.

Methods: To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Specifically, our method predicts depth maps in three steps: (1) Intrinsic Image Decomposition. We propose a reflectance extraction module with an encoder-decoder structure to extract the depth-related reflectance. Through an ablation study, we demonstrate that the module reduces the influence of illumination on depth estimation. (2) Boundary Detection. We propose a boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, that uses gradient constraints to better predict depth at object boundaries. (3) Depth Prediction. We use an encoder separate from the one in step (2) to obtain depth features from the reflectance map, and fuse them with the boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenes.

Results: Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
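To make the idea of a gradient constraint at object boundaries more concrete, the following is a minimal sketch in plain Python. It is not the RADepthNet implementation: the abstract does not specify the loss, so the function names (`gradients`, `gradient_loss`) and the choice of a mean absolute difference between forward-difference depth gradients are assumptions made for illustration only.

```python
# Illustrative sketch only; not the loss used by RADepthNet.
# Assumption: the "gradient constraint" is modelled as the mean absolute
# difference between finite-difference gradients of two depth maps.

def gradients(depth):
    """Forward-difference gradients (dx, dy) of a 2D depth map given as a list of rows."""
    h, w = len(depth), len(depth[0])
    dx = [[depth[y][x + 1] - depth[y][x] for x in range(w - 1)] for y in range(h)]
    dy = [[depth[y + 1][x] - depth[y][x] for x in range(w)] for y in range(h - 1)]
    return dx, dy


def gradient_loss(pred, target):
    """Mean absolute difference between the gradients of pred and target."""
    pdx, pdy = gradients(pred)
    tdx, tdy = gradients(target)
    diffs = [abs(p - t) for prow, trow in zip(pdx, tdx) for p, t in zip(prow, trow)]
    diffs += [abs(p - t) for prow, trow in zip(pdy, tdy) for p, t in zip(prow, trow)]
    return sum(diffs) / len(diffs)


# Hypothetical toy data: a sharp depth edge and a blurred prediction of it.
sharp = [[1.0, 1.0, 5.0, 5.0],
         [1.0, 1.0, 5.0, 5.0]]
blurred = [[1.0, 2.0, 4.0, 5.0],
           [1.0, 2.0, 4.0, 5.0]]
shifted = [[v + 1.0 for v in row] for row in sharp]  # same edge, constant offset

print(gradient_loss(sharp, sharp))    # identical edges
print(gradient_loss(blurred, sharp))  # blurred edge is penalized
print(gradient_loss(shifted, sharp))  # constant offset is not penalized
```

In this toy setting the loss penalizes a blurred boundary but ignores a constant depth offset, which is why a term of this kind would normally be combined with a direct depth loss rather than used alone.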
