3D Object Detection Based on Attention and Multi-Scale Feature Fusion.

Minghui Liu, Qiuping Zheng, Jinming Ma, Yuchen Liu, Gang Shi

doi:10.3390/s22103935


Open Access



Journal: Sensors (Basel, Switzerland)
Publication Date: May 23, 2022
Citations: 11
License type: CC BY 4.0

Affiliation: Xinjiang University

Abstract

Three-dimensional object detection in point clouds can provide more accurate object data for autonomous driving. In this paper, we propose a method named MA-MFFC that uses an attention mechanism and a multi-scale feature fusion network with a ConvNeXt module to improve the accuracy of object detection. The multi-attention (MA) module contains point-channel attention and voxel attention, which are used in voxelization and in the 3D backbone, respectively. By considering both point-wise and channel-wise information, the point-channel attention enhances the information of key points in voxels, suppresses background points during voxelization, and improves the robustness of the network. The voxel attention module is used in the 3D backbone to obtain more robust and discriminative voxel features. The MFFC module contains the multi-scale feature fusion network and the ConvNeXt module: the multi-scale feature fusion network extracts rich feature information and improves detection accuracy, and the convolutional layer is replaced with the ConvNeXt module to enhance the feature extraction capability of the network. Experimental results on the KITTI dataset show an average accuracy of 64.60% for pedestrians and 80.92% for cyclists, which is 1.33% and 2.1% higher, respectively, than the baseline network, enabling more accurate detection and localization of more difficult objects.
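
To make the idea of point-wise and channel-wise attention inside a voxel more concrete, the following is a minimal NumPy sketch. It is not the authors' MA module: the tensor layout `(V, T, C)`, the max-pooled descriptors, the single linear maps `w_point` and `w_channel`, and the multiplicative combination of the two scores are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def point_channel_attention(voxels, w_point, w_channel):
    """Reweight voxel features with point-wise and channel-wise scores.

    voxels:    (V, T, C) array, V voxels with T points and C channels each
    w_point:   (T, T) hypothetical linear map producing per-point scores
    w_channel: (C, C) hypothetical linear map producing per-channel scores
    """
    # Summarise each point by its strongest channel response: (V, T)
    point_desc = voxels.max(axis=2)
    # Summarise each channel by its strongest point response: (V, C)
    channel_desc = voxels.max(axis=1)

    # Scores in (0, 1); low scores damp points or channels
    point_score = sigmoid(point_desc @ w_point)        # (V, T)
    channel_score = sigmoid(channel_desc @ w_channel)  # (V, C)

    # Broadcast both scores into one (V, T, C) weight tensor (assumed combination)
    weights = point_score[:, :, None] * channel_score[:, None, :]
    return voxels * weights


# Toy input: random features and random (untrained) weights
rng = np.random.default_rng(0)
V, T, C = 4, 32, 16
voxels = rng.normal(size=(V, T, C))
w_point = rng.normal(scale=0.1, size=(T, T))
w_channel = rng.normal(scale=0.1, size=(C, C))

out = point_channel_attention(voxels, w_point, w_channel)
```

Because every weight lies strictly between 0 and 1, the output keeps the input shape and never exceeds the input in magnitude. In a trained network the two linear maps would be learned, so that background points receive scores near 0 and informative points scores near 1.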

Full Text