Abstract

Monocular SLAM (simultaneous localization and mapping) systems cannot directly measure depth; most suffer from scale ambiguity and require an explicit initialization step. Their inability to produce dense maps is a further limitation in applications such as navigation and obstacle avoidance. To address these problems, this paper proposes a monocular SLAM system that learns depth estimation with a DenseNet-based convolutional neural network. An encoder-decoder architecture built on transfer learning estimates the depth of monocular RGB images. By combining ORB feature extraction in the front end with direct RGB-D Bundle Adjustment optimization in the back end, the system obtains accurate camera poses and achieves dense indoor mapping from the estimated depth. Experimental results show that the monocular depth estimation model achieves good accuracy and is competitive with current popular methods. Building on this, the camera pose error is also smaller than that of traditional monocular SLAM solutions, and the system completes the dense indoor reconstruction task, forming a complete SLAM system based on a single monocular camera.
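For concreteness, the following is a minimal sketch of the kind of transfer-learning encoder-decoder depth estimator described above. It assumes a torchvision DenseNet-169 backbone with ImageNet weights and a simple skip-connected upsampling decoder; the backbone choice, channel widths, and decoder design are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of a transfer-learning encoder-decoder depth estimator.
# Assumptions (not from the paper): DenseNet-169 backbone from torchvision,
# a plain bilinear-upsampling decoder, and half-resolution depth output.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class UpBlock(nn.Module):
    """Upsample, concatenate an encoder skip feature, then convolve."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch + skip_ch, out_ch, 3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[2:], mode="bilinear",
                          align_corners=False)
        return self.conv(torch.cat([x, skip], dim=1))


class DepthNet(nn.Module):
    """DenseNet encoder (pretrained on ImageNet) + upsampling decoder."""
    def __init__(self):
        super().__init__()
        backbone = models.densenet169(weights="IMAGENET1K_V1").features
        # Slice the backbone so skip connections can be tapped at each scale.
        self.enc0 = nn.Sequential(backbone.conv0, backbone.norm0,
                                  backbone.relu0)                          # /2,  64 ch
        self.enc1 = nn.Sequential(backbone.pool0, backbone.denseblock1)   # /4,  256 ch
        self.enc2 = nn.Sequential(backbone.transition1, backbone.denseblock2)  # /8,  512 ch
        self.enc3 = nn.Sequential(backbone.transition2, backbone.denseblock3)  # /16, 1280 ch
        self.enc4 = nn.Sequential(backbone.transition3, backbone.denseblock4,
                                  backbone.norm5)                          # /32, 1664 ch
        self.up1 = UpBlock(1664, 1280, 832)
        self.up2 = UpBlock(832, 512, 416)
        self.up3 = UpBlock(416, 256, 208)
        self.up4 = UpBlock(208, 64, 104)
        self.head = nn.Conv2d(104, 1, 3, padding=1)  # single depth channel

    def forward(self, rgb):
        s0 = self.enc0(rgb)
        s1 = self.enc1(s0)
        s2 = self.enc2(s1)
        s3 = self.enc3(s2)
        x = self.enc4(s3)
        x = self.up1(x, s3)
        x = self.up2(x, s2)
        x = self.up3(x, s1)
        x = self.up4(x, s0)
        return self.head(x)  # depth map at half the input resolution


# Example: one 480x640 RGB frame -> 240x320 depth prediction.
depth = DepthNet()(torch.randn(1, 3, 480, 640))
```

The predicted depth map, upsampled to the input resolution, could then stand in for the depth channel of an RGB-D frame in the SLAM pipeline, enabling the RGB-D Bundle Adjustment and dense mapping described in the abstract.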
