Abstract

Most current research on dynamic visual Simultaneous Localization and Mapping (SLAM) focuses on scenes in which static objects occupy most of the environment. In densely populated indoor environments, however, crowd movement can cause the loss of feature information, reducing the system’s robustness and accuracy. This paper proposes a visual SLAM algorithm for dense crowd environments that builds on the ORB-SLAM2 framework and uses RGB-D cameras. First, we introduce a dedicated target detection network thread and improve the target detection network to extend its detection coverage in crowded environments, which yields a 41.5% increase in average accuracy. Second, we found that some feature points inside the detection box that do not belong to humans were mistakenly deleted. We therefore propose an algorithm based on standard deviation fitting to filter the features inside the detection box more selectively. Finally, we evaluate our system on the TUM and Bonn RGB-D dynamic datasets and compare it with ORB-SLAM2 and other state-of-the-art visual dynamic SLAM methods. Relative to ORB-SLAM2, our system reduces pose estimation error by at least 93.60% in highly dynamic environments and by at least 97.11% on the Bonn RGB-D dynamic dataset. Our method performs comparably to other recent visual dynamic SLAM methods.
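
The abstract does not spell out the standard deviation fitting step, so the following is only an illustrative sketch of one way such a filter could work, not the authors' algorithm. It assumes that each feature point carries an RGB-D depth value, that the person dominates the detection box so the median depth in the box approximates the person's depth, and that points whose depth lies far from that median (in units of the standard deviation) are static background worth keeping. The function name `keep_static_features` and the threshold `k` are hypothetical.

```python
from statistics import median, pstdev


def keep_static_features(points, depths, box, k=1.0):
    """Return one keep flag per feature point (True = keep as static).

    points: list of (x, y) pixel coordinates
    depths: list of depth values, one per point
    box:    (x_min, y_min, x_max, y_max) person detection box
    k:      hypothetical threshold, in standard deviations (assumption)
    """
    x_min, y_min, x_max, y_max = box
    inside = [x_min <= x <= x_max and y_min <= y <= y_max for x, y in points]
    box_depths = [d for d, flag in zip(depths, inside) if flag]

    if len(box_depths) < 2:
        # Too few samples to estimate a spread: drop everything in the box.
        return [not flag for flag in inside]

    centre = median(box_depths)   # assumed to approximate the person's depth
    spread = pstdev(box_depths)   # population standard deviation in the box

    keep = []
    for d, flag in zip(depths, inside):
        if not flag:
            keep.append(True)     # points outside the box are untouched
        else:
            # Inside the box: keep only points far from the person's depth.
            keep.append(abs(d - centre) > k * spread)
    return keep


# Four points on a person at about 2 m, one background point at 5 m inside
# the same box, and one point outside the box.
points = [(10, 10), (12, 14), (15, 11), (13, 13), (18, 18), (50, 50)]
depths = [2.0, 2.1, 1.9, 2.0, 5.0, 3.0]
flags = keep_static_features(points, depths, box=(0, 0, 20, 20))
```

In this toy input the background point inside the box survives while the four points at the person's depth are discarded. A real system would need to choose `k` empirically and handle missing or invalid depth readings.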
