Weakly Supervised 3D Object Detection (WS3DOD) aims to perform 3D object detection with little reliance on 3D labels, greatly reducing the cost of 3D annotation. In recent literature, pseudo-label-based approaches, which generate 3D pseudo-labels from 2D bounding boxes, have achieved impressive performance. Despite their success, two key issues remain that degrade the quality of the 3D pseudo-labels: 1) existing local object-locating algorithms cannot capture complete clusters of points globally, and 2) they cannot capture sparse points arising from the uneven point distribution produced by LiDAR sensors. Hence, we propose GAL, a Graph-induced Adaptive Learning algorithm, to generate 3D pseudo-labels. First, we propose a Cluster Locating algorithm based on the Minimum Spanning Tree (MST) to locate objects globally, exploiting the characteristic that points within an object are compact while points between objects are discrete. Second, we propose a density-guided adaptive learning algorithm, named Cuboid Drift, to optimise the Cluster Locating algorithm; Cuboid Drift accounts for the inhomogeneous distribution of reflected points on the different reflective surfaces imaged by LiDAR. Finally, the 3D pseudo-labels generated by GAL are used to train 3D detectors. Extensive experiments on the challenging KITTI and DAIR-V2X-V datasets demonstrate that GAL, without any 3D labels, is comparable with strongly supervised approaches and outperforms previous state-of-the-art WS3DOD methods. Moreover, our method saves 88% of the time spent on pseudo-label generation.
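The sketch below is only a minimal illustration of the general MST-based separation idea the abstract describes (compact within-object points vs. discrete between-object gaps); it is not the paper's actual Cluster Locating or Cuboid Drift implementation, and the function name `mst_cluster` and the `cut_dist` threshold are illustrative assumptions.

```python
# Illustrative MST-based point clustering (assumed sketch, not the GAL pipeline):
# build the minimum spanning tree over pairwise distances, cut edges longer than
# a threshold, and treat the remaining connected components as candidate objects.
import numpy as np
from scipy.spatial import distance_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def mst_cluster(points: np.ndarray, cut_dist: float = 0.5) -> np.ndarray:
    """Cluster an (N, 3) point cloud; returns an (N,) array of cluster labels."""
    dists = distance_matrix(points, points)      # (N, N) pairwise Euclidean distances
    mst = minimum_spanning_tree(dists)           # N-1 cheapest edges connecting all points
    mst.data[mst.data > cut_dist] = 0.0          # cut long edges spanning gaps between objects
    mst.eliminate_zeros()
    _, labels = connected_components(mst, directed=False)
    return labels

# Toy example: two compact blobs of points are separated into two clusters.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0.0, 0.1, (50, 3)),
                   rng.normal(5.0, 0.1, (50, 3))])
print(np.unique(mst_cluster(cloud)))  # -> [0 1]
```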