Abstract

3D object detection from LiDAR point clouds has wide applications in autonomous driving and robotics. Many recent approaches use a voxelized representation for feature extraction and apply 3D convolutional neural networks for 3D object detection. Obtaining an expressive 3D voxel representation is therefore important for detection performance. We propose a new 3D object detection framework (DVFENet) based on dual-branch voxel feature extraction, which can provide rich and complete 3D information. The first branch is a graph-attention-network-based voxel feature extraction, which applies an improved voxel graph attention feature extractor (VGAFE) to a large-scale voxelization. This branch uses graph convolution networks with an attention mechanism to extract more local neighborhood and context information. The second branch is a 3D-sparse-convolution-based voxel feature extraction that captures finer geometric features from a small-scale voxelization. We also design a decoupled RPN module that obtains task-specific features to reduce task conflict. Experiments on the challenging KITTI 3D object detection benchmark and the nuScenes detection task show that our method achieves good performance. We also conduct extensive experiments to verify the effectiveness of each component.
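
To make the dual-scale idea concrete, the following is a minimal NumPy sketch of voxelizing the same point cloud at a large and a small voxel size, which is the input each of the two branches would start from. It is an illustration only and not the DVFENet implementation: the function names `voxelize` and `voxel_mean_features`, the voxel sizes (1.0 and 0.25), and the random point cloud are all assumptions, and the per-voxel mean is a simple stand-in for the learned feature extractors (VGAFE and 3D sparse convolution) described above.

```python
import numpy as np


def voxelize(points, voxel_size):
    """Assign each point to the integer voxel it falls into.

    Returns the unique occupied voxel coordinates, the voxel id of every
    point, and the number of points in each occupied voxel.
    """
    idx = np.floor(points[:, :3] / voxel_size).astype(np.int64)
    coords, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True
    )
    # reshape keeps the inverse 1-D regardless of NumPy version
    return coords, inverse.reshape(-1), counts


def voxel_mean_features(points, inverse, num_voxels):
    """Average the points inside each voxel (placeholder for a learned extractor)."""
    sums = np.zeros((num_voxels, points.shape[1]))
    np.add.at(sums, inverse, points)  # unbuffered scatter-add per voxel
    counts = np.bincount(inverse, minlength=num_voxels)
    return sums / counts[:, None]


# Hypothetical point cloud: 1000 points in a 4 m cube.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 4.0, size=(1000, 3))

# Assumed voxel sizes; the paper's actual values are not given in the abstract.
coarse_coords, coarse_inv, coarse_counts = voxelize(points, 1.0)
fine_coords, fine_inv, fine_counts = voxelize(points, 0.25)

coarse_feats = voxel_mean_features(points, coarse_inv, len(coarse_coords))
fine_feats = voxel_mean_features(points, fine_inv, len(fine_coords))
```

In this sketch the coarse grid yields few voxels with many points each, which suits neighborhood and context aggregation, while the fine grid yields many sparsely populated voxels that preserve more geometric detail.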
