Abstract

Current single-stage 3D object detectors typically use predefined single points on the feature map to generate confidence scores. However, such point features not only lack boundary and interior information but also fail to establish an explicit association between the regressed box and the confidence score. In this paper, we present a novel single-stage object detector called the keypoint-aware single-stage 3D object detector (KASSD). First, we design a lightweight location attention module (LLM), consisting of a feature reuse strategy (FRS) and a location attention module (LAM). The FRS facilitates the flow of spatial information. By taking location into account, the LAM uses weighted feature fusion to obtain an efficient multi-level feature representation. To alleviate the inconsistencies mentioned above, we introduce a keypoint-aware module (KAM). The KAM models spatial relationships and learns rich semantic information by representing the predicted object as a set of keypoints. We conduct experiments on the KITTI dataset. The results show that our method achieves competitive performance, with 79.74% AP at the moderate difficulty level, while maintaining an inference speed of 21.8 FPS.
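To make the location-aware weighted fusion concrete, below is a minimal sketch of fusing multi-level feature maps with per-location attention weights, assuming a PyTorch-style module. The class name LocationAttentionFusion, the 1x1-convolution weight predictor, and the softmax weighting are illustrative assumptions, not the authors' exact LAM design.

```python
import torch
import torch.nn as nn

class LocationAttentionFusion(nn.Module):
    """Hypothetical sketch of location-aware weighted fusion of multi-level BEV features.
    Layer sizes and the weighting scheme are assumptions, not the paper's exact design."""
    def __init__(self, channels: int, num_levels: int = 3):
        super().__init__()
        # Predict one attention weight map per feature level, at every spatial location.
        self.attn = nn.Conv2d(channels * num_levels, num_levels, kernel_size=1)

    def forward(self, feats):
        # feats: list of num_levels tensors [B, C, H, W], already resized to a common resolution.
        stacked = torch.cat(feats, dim=1)                 # [B, C*L, H, W]
        weights = torch.softmax(self.attn(stacked), 1)    # [B, L, H, W], per-location weights
        fused = sum(w.unsqueeze(1) * f                    # weighted sum over levels
                    for w, f in zip(weights.unbind(1), feats))
        return fused                                      # [B, C, H, W]
```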

Highlights

  • Object detection has become a fundamental task in scene understanding, attracting much attention in fields such as autonomous vehicles and robotics

  • These tasks include traffic sign detection [1,2,3], traffic light detection [4,5], 2D object detection [6], and 3D object detection [7,8], all of which rely on sensors installed on autonomous vehicles

  • Our proposed keypoint-aware module (KAM) addresses the problem that, in traditional single-stage 3D object detection, the relative position between the predefined anchor point and the predicted bounding box is uncertain (see the sketch below)
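
The following sketch illustrates the keypoint idea behind the KAM: deriving a set of bird's-eye-view keypoints (four corners plus the centre) from a predicted box, which can then be used to sample box-aligned features. The helper box_to_bev_keypoints and the five-keypoint layout are hypothetical; the paper's actual keypoint set and sampling scheme may differ.

```python
import torch

def box_to_bev_keypoints(boxes: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: derive BEV keypoints (4 corners + centre) from predicted
    boxes given as [..., 5] tensors of [x, y, w, l, theta]."""
    x, y, w, l, theta = boxes.unbind(-1)
    # Keypoint offsets in the box's local frame (length along x, width along y).
    dx = torch.stack([l,  l, -l, -l, torch.zeros_like(l)], dim=-1) / 2
    dy = torch.stack([w, -w,  w, -w, torch.zeros_like(w)], dim=-1) / 2
    cos, sin = torch.cos(theta).unsqueeze(-1), torch.sin(theta).unsqueeze(-1)
    # Rotate the offsets into the global BEV frame and translate to the box centre.
    kx = x.unsqueeze(-1) + dx * cos - dy * sin
    ky = y.unsqueeze(-1) + dx * sin + dy * cos
    return torch.stack([kx, ky], dim=-1)   # [..., 5, 2] keypoints per box
```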


Summary

Introduction

Object detection has become a fundamental task in scene understanding, attracting much attention in fields such as autonomous vehicles and robotics. Since LiDAR (light detection and ranging) provides accurate distance information about the surrounding environment and is not affected by low-light conditions, it has become one of the main sources of perception. The purpose of 3D object detection on LiDAR point clouds is to predict the bounding box, class, and direction of each object, an essential job for downstream perception and planning tasks. 3D object detection methods based on deep learning have been widely adopted and have achieved remarkable progress in industry and academia [7]. Instead of learning a feature for each point, volumetric methods encode the point cloud into regular 3D grids, called voxels, to obtain a robust representation and apply a convolutional neural network (CNN).
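
As a rough illustration of the voxel encoding described above, the snippet below groups a LiDAR point cloud into a regular grid. The voxel size and point-cloud range follow common KITTI settings and are assumptions, not necessarily the configuration used in this paper.

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_size=(0.16, 0.16, 4.0),
             pc_range=(0.0, -39.68, -3.0, 69.12, 39.68, 1.0)):
    """Minimal voxelization sketch for an N x 4 LiDAR point cloud (x, y, z, intensity)."""
    # Keep only points inside the chosen range.
    pts = points[
        (points[:, 0] >= pc_range[0]) & (points[:, 0] < pc_range[3]) &
        (points[:, 1] >= pc_range[1]) & (points[:, 1] < pc_range[4]) &
        (points[:, 2] >= pc_range[2]) & (points[:, 2] < pc_range[5])
    ]
    # Integer grid coordinates of each surviving point.
    coords = ((pts[:, :3] - np.array(pc_range[:3])) / np.array(voxel_size)).astype(np.int32)
    # Group points by voxel; each group is later summarised by a small network before the CNN.
    voxel_ids, inverse = np.unique(coords, axis=0, return_inverse=True)
    voxels = [pts[inverse == i] for i in range(len(voxel_ids))]
    return voxel_ids, voxels
```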

