Depth completion aims to recover pixelwise depth from incomplete and noisy depth measurements, with or without the guidance of a reference RGB image. The task has attracted considerable research interest because of its importance in computer vision-based applications such as scene understanding, autonomous driving, 3-D reconstruction, object detection, pose estimation, and trajectory prediction. The system input, an incomplete depth map, is usually generated by projecting the 3-D points collected by ranging sensors, such as LiDAR in outdoor environments, or obtained directly from RGB-D cameras in indoor areas. However, even when a high-end LiDAR is employed, the resulting depth maps are still very sparse and noisy, especially in regions near object boundaries, which makes depth completion a challenging problem. Early approaches used conventional image processing techniques to fill holes and remove noise from the relatively dense depth maps obtained by RGB-D cameras. More recently, deep learning-based methods have become increasingly popular and have achieved encouraging results, especially in the challenging setting of LiDAR-image-based depth completion. This article systematically reviews and summarizes work on depth completion in terms of input modalities, data fusion strategies, loss functions, and experimental settings, with particular attention to the key techniques proposed in deep learning-based multiple-input methods. On this basis, we conclude by presenting the current status of depth completion and discussing several prospects for future research.