Abstract

In autonomous driving, precise spatial positioning and 3D object detection have become increasingly important as LiDAR technology has advanced and its applications have broadened. Detection models designed for RGB images struggle with the disorder of LiDAR point clouds. Although point clouds are usually treated as irregular and unordered, they carry an implicit order that arises from the arrangement of the lasers and the sequential scanning pattern. We therefore propose Frustumformer, a framework that exploits this inherent order to reduce disorder and improve the representation. The approach has three components: a frustum-based method that builds on the output of a 2D image detector, a frustum patch embedding that takes advantage of the resulting data representation, and a single-stride transformer network that fuses features at the original resolution. Together, these components let Frustumformer exploit the intrinsic order of point clouds and model long-range dependencies. Ablation studies verify the efficacy of the single-stride transformer component and of the overall architecture. In experiments on the KITTI dataset, Frustumformer outperforms existing methods.
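
The abstract does not spell out how the frustum is built from a 2D detection. As a rough illustration only, the sketch below shows one common way to do it: project each 3D point into the image and keep the points that land inside the detected box. It is not taken from Frustumformer itself. The function names `project_point` and `frustum_indices` are hypothetical, the points are assumed to be already expressed in the camera frame, and `P` is assumed to be a 3x4 pinhole projection matrix.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def project_point(P: Sequence[Sequence[float]], point: Point) -> Tuple[float, float, float]:
    """Project a camera-frame 3D point with a 3x4 matrix P; return (u, v, depth)."""
    hom = (point[0], point[1], point[2], 1.0)
    # Multiply each of the three rows of P by the homogeneous point.
    u_h, v_h, w = (sum(row[i] * hom[i] for i in range(4)) for row in P)
    if w <= 0.0:
        # The point lies behind (or on) the image plane; pixel coordinates are undefined.
        return (float("nan"), float("nan"), w)
    return (u_h / w, v_h / w, w)


def frustum_indices(points: Sequence[Point],
                    P: Sequence[Sequence[float]],
                    box: Box2D) -> List[int]:
    """Return indices of points whose projection falls inside the 2D box."""
    x_min, y_min, x_max, y_max = box
    keep = []
    for i, pt in enumerate(points):
        u, v, depth = project_point(P, pt)
        if depth > 0.0 and x_min <= u <= x_max and y_min <= v <= y_max:
            keep.append(i)
    return keep


# Toy example (assumed values): focal length 100 px, principal point (50, 50).
P = [[100.0, 0.0, 50.0, 0.0],
     [0.0, 100.0, 50.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
points = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (0.0, 0.0, -5.0), (-1.0, 0.5, 20.0)]
print(frustum_indices(points, P, (40.0, 40.0, 60.0, 60.0)))  # prints [0, 3]
```

Because the indices are returned in the original point order, any scan order present in the input is preserved within the frustum, which is the property the abstract emphasizes.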
