Abstract

Depth estimation is a challenging 3D reconstruction task that improves the accuracy of environmental perception. This work presents a new solution that improves depth-map estimation both quantitatively and qualitatively compared to existing methods. Recently, convolutional neural networks (CNNs) have demonstrated an extraordinary ability to estimate depth maps from monocular videos. However, traditional CNNs do not model topological structure and operate only on regular image regions of fixed size and weights. Graph convolutional networks (GCNs), on the other hand, can perform convolution on non-Euclidean data and can be applied to irregular image regions within a topological structure. Therefore, to preserve the geometric appearance and location of objects in the scene, we exploit GCNs in a self-supervised monocular depth estimation model. Our model consists of two parallel auto-encoder networks. The first uses a ResNet-50 encoder to extract features from the input image and a multi-scale GCN decoder to estimate the depth map. The second, based on ResNet-18, estimates the ego-motion vector (i.e., 3D pose) between two consecutive frames. The estimated 3D pose and depth map are then used to reconstruct the target image. A combination of photometric, reprojection, and smoothness losses is used to cope with poor depth predictions and to preserve object discontinuities. Our method improves on existing approaches both quantitatively and qualitatively; in particular, it achieves comparable and promising results with a prediction accuracy of 89% on the publicly available KITTI dataset, while reducing the number of trainable parameters by 40% compared to state-of-the-art solutions. In addition, we evaluated the trained model on the Make3D dataset to assess its generalization to a new dataset with low-resolution images. The source code is publicly available at https://github.com/ArminMasoumian/GCNDepth.git.
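
To make the pipeline concrete, below is a minimal PyTorch sketch of the two-network layout and the loss terms named above. It is an illustration under stated assumptions, not the GCNDepth implementation: the names (DepthNet, PoseNet, photometric_loss, smoothness_loss), the plain convolutional decoder standing in for the multi-scale GCN decoder, the loss weighting, and the omitted view-synthesis (warping) step are all hypothetical; the actual code is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class DepthNet(nn.Module):
    """ResNet-50 encoder with a placeholder convolutional decoder.

    The paper describes a multi-scale GCN decoder; a plain convolutional
    decoder is used here only to keep the sketch short and runnable.
    """

    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=None)
        # Keep everything up to the last residual stage: B x 2048 x H/32 x W/32.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.decoder = nn.Sequential(
            nn.Conv2d(2048, 256, 3, padding=1), nn.ELU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(256, 64, 3, padding=1), nn.ELU(),
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),  # disparity in (0, 1)
        )

    def forward(self, img):
        return self.decoder(self.encoder(img))


class PoseNet(nn.Module):
    """ResNet-18 taking two consecutive frames, predicting a 6-DoF ego-motion vector."""

    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # Accept two stacked RGB frames (6 channels) instead of one.
        backbone.conv1 = nn.Conv2d(6, 64, 7, stride=2, padding=3, bias=False)
        backbone.fc = nn.Linear(backbone.fc.in_features, 6)  # 3 rotation + 3 translation
        self.net = backbone

    def forward(self, frame_t, frame_t1):
        # Small scaling keeps the predicted pose near identity early in training.
        return 0.01 * self.net(torch.cat([frame_t, frame_t1], dim=1))


def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified single-scale SSIM dissimilarity over 3x3 windows."""
    mu_x, mu_y = F.avg_pool2d(x, 3, 1, 1), F.avg_pool2d(y, 3, 1, 1)
    var_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    cov = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return torch.clamp((1 - num / den) / 2, 0, 1)


def photometric_loss(pred, target, alpha=0.85):
    """Weighted SSIM + L1 photometric error between a warped view and the target frame."""
    l1 = (pred - target).abs().mean(1, keepdim=True)
    return (alpha * ssim(pred, target).mean(1, keepdim=True) + (1 - alpha) * l1).mean()


def smoothness_loss(disp, img):
    """Edge-aware smoothness: penalise disparity gradients except across image edges."""
    disp = disp / (disp.mean(dim=(2, 3), keepdim=True) + 1e-7)
    dx_d = (disp[:, :, :, :-1] - disp[:, :, :, 1:]).abs()
    dy_d = (disp[:, :, :-1, :] - disp[:, :, 1:, :]).abs()
    dx_i = (img[:, :, :, :-1] - img[:, :, :, 1:]).abs().mean(1, keepdim=True)
    dy_i = (img[:, :, :-1, :] - img[:, :, 1:, :]).abs().mean(1, keepdim=True)
    return (dx_d * torch.exp(-dx_i)).mean() + (dy_d * torch.exp(-dy_i)).mean()


if __name__ == "__main__":
    depth_net, pose_net = DepthNet(), PoseNet()
    frame_t = torch.rand(2, 3, 192, 640)   # target frame
    frame_t1 = torch.rand(2, 3, 192, 640)  # next frame
    disp = depth_net(frame_t)              # B x 1 x 192 x 640
    pose = pose_net(frame_t, frame_t1)     # B x 6
    # In the full pipeline, `pose` and the depth derived from `disp` would warp
    # frame_t1 into the target view; the photometric loss is computed on that warp.
    warped = frame_t1                      # placeholder for the synthesised view
    loss = photometric_loss(warped, frame_t) + 1e-3 * smoothness_loss(disp, frame_t)
    print(disp.shape, pose.shape, float(loss))
```

In the full model, the predicted pose and depth would drive an inverse warp of the adjacent frame (via the camera intrinsics, a projection, and grid sampling), and the photometric/reprojection loss would be computed on that synthesized view rather than on the placeholder used here.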
