Abstract

Visual simultaneous localization and mapping (VSLAM) is a widely used algorithm for localization and mapping, especially in indoor mobile robot applications. VSLAM operates on features extracted from images of the surroundings. A limitation of VSLAM is that its relocalization algorithm is slow because of the large number of candidates it must evaluate. This research aims to improve VSLAM relocalization by using semantic information as a new constraint for selecting candidates. The basic idea is to use a deep neural network, YOLO, to classify objects in an image frame and to create a high-level feature array representing the objects in that frame. Using this array, the algorithm can discard many poor candidates and thereby reduce the computation time of the relocalization process. The proposed approach was implemented in three popular VSLAM frameworks: ORB-SLAM2, OpenVSLAM, and RTAB-Map. Experiments were conducted on a pre-recorded video with known ground truth. The results showed that the proposed approach decreased the execution time of the relocalization process for all selected VSLAM frameworks.
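The following is a minimal sketch of the semantic pre-filtering idea described above, not the paper's actual implementation. It assumes YOLO detections are already available as a list of class IDs per frame; the names `semantic_vector`, `similarity`, and `filter_candidates`, the 80-class COCO label set, and the similarity threshold are all illustrative assumptions.

```python
# Illustrative sketch: prune relocalization candidates using a high-level
# semantic feature array built from YOLO object detections.
# All function names and parameters here are hypothetical, not from the paper.

NUM_CLASSES = 80  # e.g., the COCO class set commonly used with YOLO (assumption)

def semantic_vector(class_ids):
    """Build a high-level feature array counting detected objects per class."""
    vec = [0] * NUM_CLASSES
    for cid in class_ids:
        vec[cid] += 1
    return vec

def similarity(a, b):
    """Cosine similarity between two semantic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def filter_candidates(query_vec, keyframe_vecs, threshold=0.5):
    """Discard candidate keyframes whose detected-object profile differs
    too much from the query frame; only the survivors proceed to the
    expensive feature-based relocalization stage."""
    return [kf_id for kf_id, vec in keyframe_vecs.items()
            if similarity(query_vec, vec) >= threshold]

# Example: the query frame sees two chairs (class 56) and a monitor (class 62);
# only keyframes with a similar object profile are kept as candidates.
query = semantic_vector([56, 56, 62])
keyframes = {1: semantic_vector([56, 62]), 2: semantic_vector([0, 0, 0])}
print(filter_candidates(query, keyframes))  # -> [1]
```

Because comparing short class-count vectors is far cheaper than matching hundreds of local image features, rejecting dissimilar keyframes at this stage is where the reported reduction in relocalization time would come from.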
