Abstract

Understanding 3D space is crucial for planning and navigation in autonomous vehicles. Traditionally, autonomous vehicles use LiDAR sensors to build a 3D map of their environment. LiDAR data are often noisy and sparse, making them not fully reliable for real-time applications such as autonomous driving, so multiple redundant sensors are typically deployed. The array of cameras in an autonomous vehicle intended for detection and tracking can be reused for depth estimation as well. In this paper, we propose an unsupervised monocular depth estimation approach for autonomous vehicles that can serve as a redundant depth estimator, replacing multiple LiDAR sensors. A deep learning based method with a multiscale encoder-decoder network is used to estimate depth. The target view of each stereo pair is reconstructed by inverse warping the source view using geometric camera projection. The network is guided by the stereo positive–negative (SPN) loss, which minimizes the discrepancy between the reconstructed view and its corresponding stereo ground truth while maximizing the discrepancy between the reconstructed view and the opposite stereo ground truth. The proposed approach shows state-of-the-art accuracy on the KITTI autonomous driving dataset.
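To make the reconstruction and loss concrete, the following is a minimal PyTorch sketch of the idea, under assumptions of our own: the warp below uses a rectified-stereo disparity shift rather than the full geometric camera projection the abstract describes, the L1 photometric terms and the hinge margin are hypothetical choices, and the names inverse_warp, spn_loss, and margin are illustrative rather than taken from the paper.

import torch
import torch.nn.functional as F

def inverse_warp(src, disp):
    # Reconstruct the target view by sampling the opposite stereo image
    # at horizontally shifted coordinates (a rectified-stereo special
    # case of the geometric projection described in the abstract).
    # src: (B, 3, H, W) source view; disp: (B, 1, H, W) disparity in pixels.
    _, _, h, w = src.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=src.device),
        torch.linspace(-1.0, 1.0, w, device=src.device),
        indexing="ij",
    )
    # Shift x-coordinates by the predicted disparity, rescaled to
    # grid_sample's normalized [-1, 1] coordinate range.
    xs = xs.unsqueeze(0) - 2.0 * disp.squeeze(1) / w
    grid = torch.stack((xs, ys.unsqueeze(0).expand_as(xs)), dim=-1)
    return F.grid_sample(src, grid, align_corners=True)

def spn_loss(recon, stereo_gt, opposite_gt, margin=0.5):
    # Positive term: pull the reconstruction toward its stereo ground truth.
    pos = F.l1_loss(recon, stereo_gt)
    # Negative term: push the reconstruction away from the opposite stereo
    # view; the hinge at `margin` keeps this term bounded.
    neg = torch.clamp(margin - F.l1_loss(recon, opposite_gt), min=0.0)
    return pos + neg

In training, recon would presumably be the target view synthesized from the predicted disparity via inverse_warp, with the total loss accumulated over both stereo directions and over the scales of the multiscale decoder.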
