Abstract

Weakly supervised 3D object detection for autonomous driving has focused primarily on cars, whose rectangular boundaries are distinct and whose instances are abundant. Detecting categories with ambiguous rectangular boundaries and fewer instances, such as pedestrians and cyclists, remains challenging and has received limited attention. Ambiguous boundaries make it difficult to generate accurate 3D pseudo labels, while the scarcity of instances often causes convergence problems during detector training. Pedestrians and cyclists are dense inside their 3D bounding boxes but sparse at the corners and boundaries, so density is a practical cue for locating and discriminating them in point clouds. This paper proposes a density-based 3D pseudo-label generation module (DPL-3D) to address the challenge of ambiguous rectangular boundaries. By leveraging the density of 3D points, DPL-3D segments background points effectively, which improves the estimated location, dimensions, and orientation of the generated pseudo labels. Training with few samples tends to converge to local optima. Introducing multi-modal data into the detector network can strengthen the constraints on object features, but 2D images and 3D point clouds differ in resolution. Our approach to this resolution gap is motivated by the observation that neighboring regions with similar colors and textures in a 2D image are likely to be spatially close in 3D space. We therefore introduce a multi-modal network driven by superpixel segmentation. This network discriminates objects in both 2D images and 3D point clouds, bridging the resolution gap and exploiting complementary features from the two modalities.
Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed methods for weakly supervised 3D object detection, particularly for categories with ambiguous rectangular boundaries and few instances.
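
As an illustration of the density cue described above, the following minimal sketch keeps only points with enough close neighbors and fits an axis-aligned box to them. It is not the DPL-3D module itself: the function names, the `radius` and `min_neighbors` parameters, and the brute-force neighbor count are assumptions made for this example, and orientation estimation is omitted.

```python
import numpy as np


def local_density(points: np.ndarray, radius: float) -> np.ndarray:
    """Count, for each point, how many other points lie within `radius`.

    Brute-force O(N^2) pairwise distances; fine for a small illustrative
    cluster, not for a full LiDAR sweep.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Subtract 1 so a point does not count itself as a neighbor.
    return (dists < radius).sum(axis=1) - 1


def dense_core_box(points: np.ndarray, radius: float = 0.4, min_neighbors: int = 5):
    """Drop sparse (likely background) points and fit an axis-aligned box.

    Hypothetical helper: returns ((center, size), keep_mask), or
    (None, keep_mask) when no point is dense enough.
    """
    keep = local_density(points, radius) >= min_neighbors
    core = points[keep]
    if len(core) == 0:
        return None, keep
    lo, hi = core.min(axis=0), core.max(axis=0)
    return ((lo + hi) / 2.0, hi - lo), keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A compact synthetic "object" near (5, 0, 0) plus three isolated points.
    cluster = np.array([5.0, 0.0, 0.0]) + rng.uniform(-0.1, 0.1, size=(50, 3))
    stray = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 0.0], [-8.0, 3.0, 1.0]])
    box, keep = dense_core_box(np.vstack([cluster, stray]))
    print("kept points:", int(keep.sum()), "box:", box)
```

In this toy setting the isolated points have no neighbors within the radius and are discarded, so the fitted box is determined by the dense cluster alone.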
