Abstract

When faced with occlusions and non-rigid motion, machines often struggle with depth estimation, a task humans perform effortlessly with just one eye. Consecutive RGB frames carry rich temporal cues, such as symmetry and optical flow, which current deep-learning models fail to exploit effectively. To address this limitation, we introduce Temporal Symmetry-based Uncertainty Depth (TSUDepth), a framework designed to improve the accuracy of unsupervised monocular depth estimation. Its Temporal Symmetry-based Occlusion Optimization (TSOO) component robustly identifies occluded regions and optimizes them consistently across adjacent frames. In parallel, we propose Temporal Optical Flow Masking (TOFM) to identify and exclude pixels that appear static between adjacent frames (e.g., out-of-range depths and non-rigid objects). We further introduce Cross-Resolution Distillation (CRED) to improve depth estimation accuracy across resolutions, particularly at low input resolutions. Finally, we design a new depth estimation network that builds on the DPT architecture and incorporates a GRU module to refine depth details. Extensive experiments on benchmark datasets, including KITTI, Cityscapes, and Make3D, show that TSUDepth consistently achieves state-of-the-art performance. Code is available at https://github.com/BlueEg/TSUDepth/.
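The abstract does not detail how TOFM is computed; as a rough illustration of the general idea of masking out static pixels via optical flow, the following is a minimal sketch. The function name, the thresholding rule, and the use of forward/backward flow magnitudes are all assumptions for illustration, not the paper's actual method.

```python
import numpy as np

def temporal_flow_mask(flow_fwd, flow_bwd, eps=1e-2):
    """Illustrative sketch (not the paper's exact TOFM): keep a pixel only if
    its forward or backward optical flow magnitude exceeds a small threshold;
    pixels with near-zero flow in both directions are treated as static and
    excluded from the training loss.

    flow_fwd, flow_bwd: (H, W, 2) arrays of per-pixel flow vectors.
    Returns a boolean (H, W) mask, True where the pixel is kept.
    """
    mag_fwd = np.linalg.norm(flow_fwd, axis=-1)  # per-pixel flow magnitude
    mag_bwd = np.linalg.norm(flow_bwd, axis=-1)
    return (mag_fwd > eps) | (mag_bwd > eps)
```

In a training pipeline such a mask would typically multiply the per-pixel photometric loss, so that static regions contribute no gradient.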
