Abstract

Monocular SLAM (simultaneous localization and mapping) systems cannot directly measure depth; most suffer from scale ambiguity and require an explicit initialization step. Their inability to produce dense maps is a further limitation in applications such as navigation and obstacle avoidance. To address these problems, this paper proposes a monocular SLAM system that learns depth estimation with a DenseNet-based convolutional neural network. An encoder-decoder architecture built on transfer learning estimates the depth of monocular RGB images. By combining ORB feature extraction in the front end with direct RGB-D Bundle Adjustment optimization in the back end, the system obtains accurate camera poses and achieves dense indoor mapping from the estimated depth. Experimental results show that the monocular depth estimation model achieves good accuracy and is competitive with current popular methods. Building on this, the camera pose error is also smaller than that of traditional monocular SLAM solutions, and the system completes the dense indoor reconstruction task, forming a complete SLAM system based on a single monocular camera.
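For concreteness, the following is a minimal sketch of the kind of transfer-learning encoder-decoder depth estimator described above. It assumes a torchvision DenseNet-169 backbone with ImageNet weights and a simple skip-connected upsampling decoder; the backbone choice, channel widths, and decoder design are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of a transfer-learning encoder-decoder depth estimator.
# Assumptions (not from the paper): DenseNet-169 backbone from torchvision,
# a plain bilinear-upsampling decoder, and half-resolution depth output.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class UpBlock(nn.Module):
    """Upsample, concatenate an encoder skip feature, then convolve."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch + skip_ch, out_ch, 3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[2:], mode="bilinear",
                          align_corners=False)
        return self.conv(torch.cat([x, skip], dim=1))


class DepthNet(nn.Module):
    """DenseNet encoder (pretrained on ImageNet) + upsampling decoder."""
    def __init__(self):
        super().__init__()
        backbone = models.densenet169(weights="IMAGENET1K_V1").features
        # Slice the backbone so skip connections can be tapped at each scale.
        self.enc0 = nn.Sequential(backbone.conv0, backbone.norm0,
                                  backbone.relu0)                          # /2,  64 ch
        self.enc1 = nn.Sequential(backbone.pool0, backbone.denseblock1)   # /4,  256 ch
        self.enc2 = nn.Sequential(backbone.transition1, backbone.denseblock2)  # /8,  512 ch
        self.enc3 = nn.Sequential(backbone.transition2, backbone.denseblock3)  # /16, 1280 ch
        self.enc4 = nn.Sequential(backbone.transition3, backbone.denseblock4,
                                  backbone.norm5)                          # /32, 1664 ch
        self.up1 = UpBlock(1664, 1280, 832)
        self.up2 = UpBlock(832, 512, 416)
        self.up3 = UpBlock(416, 256, 208)
        self.up4 = UpBlock(208, 64, 104)
        self.head = nn.Conv2d(104, 1, 3, padding=1)  # single depth channel

    def forward(self, rgb):
        s0 = self.enc0(rgb)
        s1 = self.enc1(s0)
        s2 = self.enc2(s1)
        s3 = self.enc3(s2)
        x = self.enc4(s3)
        x = self.up1(x, s3)
        x = self.up2(x, s2)
        x = self.up3(x, s1)
        x = self.up4(x, s0)
        return self.head(x)  # depth map at half the input resolution


# Example: one 480x640 RGB frame -> 240x320 depth prediction.
depth = DepthNet()(torch.randn(1, 3, 480, 640))
```

The predicted depth map, upsampled to the input resolution, could then stand in for the depth channel of an RGB-D frame in the SLAM pipeline, enabling the RGB-D Bundle Adjustment and dense mapping described in the abstract.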
