Article

Localization and Mapping Method Based on Multimodal Information Fusion and Deep Learning for Dynamic Object Removal

Chong Ma, Peng Cheng, and Chenxiao Cai *

School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China

* Correspondence: ccx5281@njust.edu.cn

Received: 10 September 2023; Accepted: 25 December 2023; Published: 26 June 2024

Abstract: A simultaneous localization and mapping (SLAM) system based on visual-inertial fusion is presented in this paper to address the pose estimation drift caused by weakly textured environments or rapid robot motion. The camera and inertial measurement unit (IMU) are initialized through IMU pre-integration and visual front-end processing, and a tightly coupled residual function model is employed in the back end to eliminate accumulated errors. To realize real-time pose estimation in complex loop scenes, a sliding-window optimization method based on a marginalization strategy is adopted to improve the optimization efficiency of the system, and a loop detection algorithm based on the bag-of-words model is exploited to solve the cumulative error problem that arises during long-term operation. Furthermore, because dynamic targets in complex scenes interfere with environment modeling and localization, this paper introduces a deep-learning semantic segmentation model to segment and eliminate dynamic targets. System performance tests are carried out on the EuRoC and KITTI datasets. The experimental results illustrate that the proposed method improves system robustness and localization accuracy compared with a pure-vision algorithm and with a visual-inertial fusion algorithm that does not remove dynamic targets.
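The initialization step relies on IMU pre-integration, which accumulates relative motion increments between consecutive keyframes independently of the global pose, so the optimizer can reuse them cheaply. The following is a minimal sketch of such pre-integration under simple Euler integration; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix of a 3-vector, used for small-angle rotation updates."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def preintegrate(imu_samples, dt, acc_bias, gyr_bias):
    """Accumulate rotation, velocity, and position increments between two keyframes.

    imu_samples: iterable of (accel, gyro) 3-vectors in the body frame;
    dt: sampling period in seconds. Gravity is not subtracted here because the
    increments are expressed in the first body frame and gravity is handled
    later, in the IMU residual.
    """
    dR = np.eye(3)      # relative rotation increment
    dv = np.zeros(3)    # velocity increment
    dp = np.zeros(3)    # position increment
    for acc, gyr in imu_samples:
        a = np.asarray(acc) - acc_bias          # bias-corrected specific force
        w = np.asarray(gyr) - gyr_bias          # bias-corrected angular rate
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt ** 2
        dv = dv + (dR @ a) * dt
        dR = dR @ (np.eye(3) + skew(w) * dt)    # first-order exponential map
    return dR, dv, dp
```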
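The tightly coupled back end typically minimizes one joint cost over all states in the window. A representative formulation of such a cost, in the style of VINS-Mono (the paper's exact residual definitions may differ), is

\[
\min_{\mathcal{X}} \left\{ \left\| \mathbf{r}_p - \mathbf{H}_p \mathcal{X} \right\|^2 + \sum_{k \in \mathcal{B}} \left\| \mathbf{r}_{\mathcal{B}}\big(\hat{\mathbf{z}}^{b_k}_{b_{k+1}}, \mathcal{X}\big) \right\|^2_{\mathbf{P}^{b_k}_{b_{k+1}}} + \sum_{(l,j) \in \mathcal{C}} \rho\Big( \left\| \mathbf{r}_{\mathcal{C}}\big(\hat{\mathbf{z}}^{c_j}_{l}, \mathcal{X}\big) \right\|^2_{\mathbf{P}^{c_j}_{l}} \Big) \right\},
\]

where the three terms are the marginalization prior, the IMU pre-integration residuals over consecutive keyframe pairs \(\mathcal{B}\), and the visual reprojection residuals over feature observations \(\mathcal{C}\) under a robust kernel \(\rho\).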
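Within the sliding window, a marginalization strategy usually removes the oldest states through the Schur complement of the Gauss-Newton normal equations; the abstract does not spell this out, so what follows is the standard construction. Partitioning the system into marginalized states \(\delta\mathbf{x}_m\) and retained states \(\delta\mathbf{x}_r\),

\[
\begin{bmatrix} \mathbf{H}_{mm} & \mathbf{H}_{mr} \\ \mathbf{H}_{rm} & \mathbf{H}_{rr} \end{bmatrix}
\begin{bmatrix} \delta\mathbf{x}_m \\ \delta\mathbf{x}_r \end{bmatrix} =
\begin{bmatrix} \mathbf{b}_m \\ \mathbf{b}_r \end{bmatrix},
\qquad
\mathbf{H}^{*} = \mathbf{H}_{rr} - \mathbf{H}_{rm}\mathbf{H}_{mm}^{-1}\mathbf{H}_{mr},
\quad
\mathbf{b}^{*} = \mathbf{b}_r - \mathbf{H}_{rm}\mathbf{H}_{mm}^{-1}\mathbf{b}_m,
\]

and \((\mathbf{H}^{*}, \mathbf{b}^{*})\) enters the next window as the prior term \(\|\mathbf{r}_p - \mathbf{H}_p\mathcal{X}\|^2\) above, so past information is retained without keeping the old states in the optimization.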
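Loop detection with a bag-of-words model scores how similar the current frame is to past keyframes over quantized visual words. Below is a minimal sketch of DBoW-style L1 scoring, with illustrative names and an assumed external descriptor quantizer:

```python
import numpy as np

def bow_vector(word_ids, vocab_size):
    """L1-normalized bag-of-words histogram from quantized feature descriptors."""
    v = np.bincount(np.asarray(word_ids), minlength=vocab_size).astype(float)
    s = v.sum()
    return v / s if s > 0 else v

def bow_score(v1, v2):
    """Similarity in [0, 1] for L1-normalized vectors (DBoW-style L1 score)."""
    return 1.0 - 0.5 * np.abs(v1 - v2).sum()

def detect_loop_candidates(query, keyframe_db, threshold=0.3):
    """Indices of past keyframes similar enough to be loop-closure candidates."""
    return [i for i, v in enumerate(keyframe_db) if bow_score(query, v) >= threshold]
```

A candidate returned this way is normally confirmed by geometric verification before a loop-closure constraint is added to the optimization, which is what corrects the accumulated drift.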
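For dynamic-object removal, the semantic segmentation model produces a per-pixel class map, and feature points that land on dynamic classes are discarded before tracking and optimization. The sketch below assumes exactly that interface; the class ids and names are hypothetical, not the paper's:

```python
import numpy as np

DYNAMIC_CLASSES = {11, 13, 14}  # hypothetical label ids, e.g. person, car, truck

def filter_dynamic_features(keypoints, class_map):
    """Keep keypoints whose pixel label is not a dynamic class.

    keypoints: iterable of (u, v) pixel coordinates;
    class_map: H x W integer array output by the segmentation model.
    """
    h, w = class_map.shape
    kept = []
    for u, v in keypoints:
        ui, vi = int(round(u)), int(round(v))
        if 0 <= vi < h and 0 <= ui < w and class_map[vi, ui] not in DYNAMIC_CLASSES:
            kept.append((u, v))
    return kept
```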