“Focusing on the right regions” — Guided saliency prediction for visual SLAM

Sheng Jin, Xuyang Dai, Qinghao Meng

doi:10.1016/j.eswa.2022.119068

Abstract

Features play an important role in achieving robust visual simultaneous localization and mapping (SLAM) in complex environments. Although every scene feature provides some information, features differ in their importance to SLAM. As in the human attention mechanism, more attention should be paid to features in salient and important regions. This paper therefore proposes saliency prediction-based SLAM (SP-SLAM), a visual SLAM system that combines ORB-SLAM3 with a saliency prediction model. The proposed saliency prediction model focuses on the right regions by considering geometric, semantic, and depth information, which makes visual SLAM more accurate. Moreover, a multi-level strategy is introduced so that the model learns temporally consistent information between adjacent images and keeps focusing on the same regions over time. The predicted saliency map then provides saliency weights for robust tracking and optimization, improving the accuracy of visual SLAM. Comprehensive test results show that the proposed SP-SLAM achieves superior localization accuracy and saliency prediction performance.
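
The abstract states that the predicted saliency map supplies weights for tracking and optimization, but it does not give the formulation. As an illustration only, the following minimal sketch shows one plausible reading: each feature's squared reprojection error is scaled by the saliency value at its pixel location. The function names, the `floor` parameter, and the linear weight mapping are assumptions made for this example and are not taken from the paper or from ORB-SLAM3.

```python
import numpy as np


def sample_saliency(saliency_map, keypoints, floor=0.1):
    """Return one weight in [floor, 1] per (x, y) keypoint.

    Hypothetical helper: saliency_map is an HxW array with values in [0, 1];
    `floor` keeps non-salient features from being ignored entirely.
    """
    h, w = saliency_map.shape
    # Round to the nearest pixel and clamp to the image bounds.
    xs = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, h - 1)
    s = np.clip(saliency_map[ys, xs].astype(float), 0.0, 1.0)
    return floor + (1.0 - floor) * s


def weighted_reprojection_cost(observed, projected, weights):
    """Sum of squared 2D reprojection errors, each scaled by its weight."""
    residuals = observed - projected
    squared_norms = np.sum(residuals ** 2, axis=1)
    return float(np.sum(weights * squared_norms))
```

In a full system a cost of this kind would be minimized over camera poses and map points; here it is only evaluated for fixed projections, to show how salient features would dominate the objective.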

Full Text