Abstract

Feature-based visual odometry (VO) estimates the pose change between consecutive frames by matching 3D spatial points to 2D pixel points and solving for the pose with the PnP algorithm. The accuracy of the feature points' spatial locations therefore plays an important role in the calculation: incorrectly positioned spatial points greatly degrade the performance of visual odometry. Depth estimation based on deep learning can effectively improve the accuracy of spatial point locations, but it does not focus on the correctness of depth values at feature points. This paper proposes a stereo VO system that integrates deep learning with the traditional approach and focuses on the depth of feature points, thereby indirectly improving the accuracy of visual odometry. Specifically, training is divided into two stages. The first stage trains a stereo matching network on a binocular dataset to obtain an initial model. In the second stage, a feature extraction network is added to obtain a feature point mask from the extracted feature points; one loss function is built using this mask, and a reprojection loss function is built using the pose values. Both loss functions are added to the first-stage loss function during second-stage training. Finally, the trained stereo matching network generates the depth values, the feature matching network provides the matched feature points, and stereo visual odometry is constructed by calculating the relative pose between consecutive frames with the PnP algorithm. Extensive experiments on the KITTI dataset show the robustness of our system.
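The geometry behind the two second-stage loss terms can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the L1 form of the masked depth loss, and the choice to pass camera intrinsics as a tuple `(fx, fy, cx, cy)` are assumptions made for illustration, and the relative pose `(R, t)` is taken as given rather than estimated by PnP. The sketch converts a stereo disparity to depth, lifts a feature point to 3D, moves it into the next frame, and measures how far its projection lands from the matched pixel.

```python
# Illustrative sketch only; names and loss forms are assumptions, not the paper's code.

def disparity_to_depth(disparity, fx, baseline):
    """Depth of a rectified stereo pixel: Z = fx * B / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return fx * baseline / disparity


def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform(point, R, t):
    """Apply the rigid motion X' = R X + t (R: 3x3 nested lists, t: 3 values)."""
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_error(pixel_prev, disparity, pixel_next, R, t, K, baseline):
    """Pixel distance between the reprojected point and its match in the next frame."""
    fx, fy, cx, cy = K
    depth = disparity_to_depth(disparity, fx, baseline)
    point = back_project(pixel_prev[0], pixel_prev[1], depth, fx, fy, cx, cy)
    u, v = project(transform(point, R, t), fx, fy, cx, cy)
    return ((u - pixel_next[0]) ** 2 + (v - pixel_next[1]) ** 2) ** 0.5


def masked_l1(pred_depth, true_depth, mask):
    """Mean absolute depth error over feature-point pixels (mask == 1).

    The L1 form is an assumption; the abstract does not specify the loss.
    """
    errors = [abs(p - g) for p, g, m in zip(pred_depth, true_depth, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0
```

With an identity rotation and zero translation, `reprojection_error` returns zero whenever the matched pixel equals the original pixel. A depth error at a feature point shifts the reprojected pixel once the camera translates, which is one way to see why depth accuracy at feature points matters for pose estimation.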

