Abstract

Traditional feature-based visual SLAM algorithms assume a static environment when recovering scene structure and camera motion, so dynamic objects in the scene degrade positioning accuracy. In this paper, we propose combining deep-learning-based image semantic segmentation with a traditional visual SLAM framework to reduce the interference of dynamic objects with the positioning results. First, a supervised convolutional neural network (CNN) segments the objects in the input image to produce a semantic image. Second, feature points are extracted from the original image, and those belonging to dynamic objects (cars and pedestrians) are eliminated according to the semantic image. Finally, a traditional monocular SLAM method tracks the camera motion using the remaining feature points. Experiments on the ApolloScape dataset show that, compared with the traditional method, the proposed method improves positioning accuracy in dynamic scenes by about 17%.
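The feature-elimination step described above can be sketched as a simple per-keypoint lookup in the semantic label image. The label ids and the keypoint representation below are assumptions for illustration; an actual pipeline would use the dataset's label map and ORB keypoints from the SLAM front end.

```python
import numpy as np

# Hypothetical label ids for the dynamic classes (dataset-specific; assumed here).
DYNAMIC_LABELS = {13, 19}  # e.g. car, pedestrian

def filter_dynamic_keypoints(keypoints, semantic_image, dynamic_labels=DYNAMIC_LABELS):
    """Keep only keypoints that do not fall on a dynamic-object pixel.

    keypoints: iterable of (x, y) pixel coordinates
    semantic_image: 2-D array of per-pixel class labels, shape (H, W)
    """
    kept = []
    for x, y in keypoints:
        # Semantic image is indexed row-major, i.e. [y, x].
        if semantic_image[int(round(y)), int(round(x))] not in dynamic_labels:
            kept.append((x, y))
    return np.asarray(kept)
```

The surviving keypoints would then be passed to the monocular SLAM tracker in place of the full feature set.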
