Autonomous navigation in dynamic environments presents a significant challenge for mobile robotic systems. This paper proposes a novel approach utilizing Convolutional Neural Networks (CNNs) for multi-object detection in 3D space and 2D segmentation using bird’s eye view (BEV) maps derived from 3D Light Detection and Ranging (LiDAR) data. Our method aims to enable mobile robots to localize movable objects and their occupancy, which is crucial for safe and efficient navigation. To address the scarcity of labeled real-world datasets, a synthetic dataset based on a simulation environment is generated to train and evaluate our model. Additionally, we employ a subset of the NVIDIA r2b dataset for evaluation in the real world. Furthermore, we integrate our CNN-based detection and segmentation model into a Robot Operating System 2 (ROS2) framework, facilitating communication between mobile robots and a centralized node for data aggregation and map creation. Our experimental results demonstrate promising performance, showcasing the potential applicability of our approach in future assembly systems. While further validation with real-world data is warranted, our work contributes to advancing perception systems by proposing a solution for multi-source, multi-object tracking and mapping.
Read full abstract