Abstract
Although depth estimation is a key technology for three-dimensional sensing applications involving motion, active sensors such as LiDAR and depth cameras tend to be expensive and bulky. Here, we explore the potential of monocular depth estimation (MDE) using a self-supervised approach. MDE is a promising technology, but supervised learning suffers from the need for accurate ground-truth depth data. Recent studies have enabled self-supervised training of an MDE model using only monocular image sequences and image-reconstruction errors. We pretrained networks using multiple datasets, including monocular and stereo image sequences. The main challenges for the self-supervised MDE model were occlusions and dynamic objects. We proposed novel loss functions to handle these problems, min-over-all and min-with-flow losses, both based on the per-pixel minimum reprojection error of Monodepth2 and extended to stereo images and optical flow. With extensive pretraining and the novel losses, our model outperformed existing unsupervised approaches in quantitative depth accuracy and in the ability to distinguish small objects against the background, as evaluated on the KITTI 2015 benchmark.
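To make the loss idea concrete, below is a minimal PyTorch sketch of a per-pixel minimum reprojection loss in the spirit of Monodepth2, taking the minimum over all available source views (temporal neighbours and, with stereo training, the opposite stereo image), in the style of the min-over-all loss described above. The function name, tensor shapes, and the plain L1 photometric error are illustrative assumptions rather than the paper's implementation; Monodepth2 itself combines SSIM with L1.

```python
import torch

def min_reprojection_loss(target: torch.Tensor,
                          warped_sources: list[torch.Tensor]) -> torch.Tensor:
    """target: (B, 3, H, W) reference image.
    warped_sources: source views warped into the target camera using the
    predicted depth and relative poses (or, for a min-with-flow variant,
    optical flow). Taking the per-pixel minimum lets each pixel be
    supervised by whichever view it is actually visible in, which
    suppresses the influence of occlusions."""
    errors = torch.stack(
        [(w - target).abs().mean(dim=1) for w in warped_sources], dim=1
    )                                    # (B, V, H, W): one error map per view
    min_error, _ = errors.min(dim=1)     # per-pixel minimum over the V views
    return min_error.mean()

# Example: two temporal neighbours plus the stereo partner, all pre-warped
# into the target view (random tensors stand in for real warps here).
target = torch.rand(2, 3, 192, 640)
warped = [torch.rand(2, 3, 192, 640) for _ in range(3)]
loss = min_reprojection_loss(target, warped)
```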
Highlights
Three-dimensional (3D) vision involves inferring 3D geometric information from two-dimensional (2D) images
As depth information is important to moving vehicles, we used driving datasets such as KITTI, Cityscapes, Waymo, and A2D2 [1]–[4]
With a convolutional neural network (CNN), numerous kernels are adjusted automatically for accurate depth prediction, and less pre- and postprocessing and regularization is required
Summary
Three-dimensional (3D) vision involves inferring 3D geometric information from two-dimensional (2D) images. Monocular depth estimation (MDE) produces a dense depth map from a single image. In learning-based MDE, the weights to be multiplied are optimized during training to predict true depths; however, the results of such early approaches are relatively inaccurate, with average relative depth error rates greater than 30%. With a CNN, numerous kernels are adjusted automatically for accurate depth prediction, and less pre- and postprocessing and regularization is required. While these methods produce the most accurate results [7], [9], with a relative depth error rate below 10%, they require depth-labeled datasets. The core of self-supervision is the photometric loss, which pairs temporally adjacent or stereo images and synthesizes one image from the other using an estimated depth map and the relative pose between them. The difference between the synthesized and original images represents the depth and pose estimation error.
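As a concrete illustration of this view-synthesis step, the sketch below back-projects each target pixel using the predicted depth, transforms it by the estimated relative pose, projects it into the source camera, and bilinearly samples the source image there. All names and tensor shapes are assumptions for illustration; this is not the authors' code.

```python
import torch
import torch.nn.functional as F

def warp_source_to_target(source, depth, T, K, K_inv):
    """source: (B, 3, H, W) source image; depth: (B, 1, H, W) predicted
    depth of the *target* frame; T: (B, 4, 4) target-to-source pose;
    K, K_inv: (B, 3, 3) camera intrinsics and their inverse."""
    B, _, H, W = source.shape
    # Pixel grid of the target frame in homogeneous coordinates, (B, 3, H*W).
    ys, xs = torch.meshgrid(torch.arange(H, dtype=source.dtype),
                            torch.arange(W, dtype=source.dtype), indexing="ij")
    pix = torch.stack([xs.reshape(-1), ys.reshape(-1),
                       torch.ones(H * W, dtype=source.dtype)]).expand(B, -1, -1)
    # Back-project to 3D points in the target camera, then move them into
    # the source camera with the relative pose.
    cam = (K_inv @ pix) * depth.reshape(B, 1, -1)          # (B, 3, H*W)
    cam_h = torch.cat([cam, torch.ones(B, 1, H * W)], 1)   # homogeneous coords
    src = (T @ cam_h)[:, :3]                               # (B, 3, H*W)
    # Project into the source image plane.
    proj = K @ src
    xy = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
    # Normalise to [-1, 1] for grid_sample and resample the source image.
    grid = torch.stack([2 * xy[:, 0] / (W - 1) - 1,
                        2 * xy[:, 1] / (H - 1) - 1], dim=2).reshape(B, H, W, 2)
    return F.grid_sample(source, grid, padding_mode="border", align_corners=True)

# The photometric loss then compares the synthesized view with the real
# target image, e.g.:
# loss = (warp_source_to_target(src, depth, T, K, K.inverse()) - target).abs().mean()
```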