Abstract

This paper tackles the task of fusing features from images and their corresponding point clouds for 3D object detection in autonomous driving scenarios, building on AVOD, an Aggregate View Object Detection network. The proposed fusion algorithms fuse features extracted from Bird’s Eye View (BEV) LIDAR point clouds and their corresponding RGB images. Unlike existing fusion methods, which simply adopt a concatenation module, an element-wise sum module, or an element-wise mean module, our proposed fusion algorithms enhance the interaction between BEV feature maps and their corresponding image feature maps through a novel structure: one algorithm uses single-level feature maps and the other utilizes multilevel feature maps. Experiments on the KITTI 3D object detection benchmark show that our proposed fusion algorithms produce better results on 3D mAP and AHS, with less speed loss, compared to the existing fusion method.
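For reference, the three baseline fusion operators named in the abstract (concatenation, element-wise sum, element-wise mean) can be sketched as follows. This is a minimal NumPy illustration of those baselines only, not of the proposed fusion algorithms. The function name `fuse`, the `(H, W, C)` layout, and the 7x7x32 toy shapes are assumptions made for the example and are not taken from the paper or from the AVOD code.

```python
import numpy as np


def fuse(bev_feat, img_feat, mode="mean"):
    """Fuse two feature maps of identical (H, W, C) shape with a baseline operator.

    Hypothetical helper for illustration; requiring identical shapes for every
    mode is a simplification (concatenation only needs matching H and W).
    """
    if bev_feat.shape != img_feat.shape:
        raise ValueError("feature maps must share the same shape")
    if mode == "concat":
        # Stack the two maps along the channel axis: C channels become 2C.
        return np.concatenate([bev_feat, img_feat], axis=-1)
    if mode == "sum":
        # Element-wise sum keeps the channel count unchanged.
        return bev_feat + img_feat
    if mode == "mean":
        # Element-wise mean: the sum scaled by 1/2.
        return (bev_feat + img_feat) / 2.0
    raise ValueError(f"unknown fusion mode: {mode}")


# Toy stand-ins for cropped BEV and image feature maps (assumed shapes).
rng = np.random.default_rng(0)
bev = rng.standard_normal((7, 7, 32)).astype(np.float32)
img = rng.standard_normal((7, 7, 32)).astype(np.float32)

for mode in ("concat", "sum", "mean"):
    print(mode, fuse(bev, img, mode).shape)
```

None of the three operators has learnable parameters, and each treats the two views independently at every spatial location, which is the limited interaction between BEV and image feature maps that the abstract contrasts with the proposed structure.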

Highlights

  • Deep neural networks rely on a large amount of data to guarantee training effectiveness [1]

  • We evaluated our fusion algorithm on KITTI 3D object detection with images and point clouds, based on AVOD [16], a two-stage 3D object detector for autonomous driving scenarios

  • We proposed two fusion algorithms

Summary

Introduction

Deep neural networks rely on a large amount of data to guarantee training effectiveness [1]. In general, the more sensor data that is fed to the network model, the better the performance that is obtained. In the field of self-driving cars and 3D object detection, the camera and lidar are the dominant sensors. RGB images from cameras contain rich texture information about the surroundings, but depth is lost. Point clouds from lidar provide accurate depth and reflection intensity descriptions, but their resolution is comparatively low. An effective fusion [2] of these sensors is expected to overcome the drawbacks of a single sensor in complicated driving scenarios.
