Abstract

Three-dimensional object detection is vital for understanding the driving environment of an autonomous vehicle. Different sensors are used for this purpose, such as cameras and LiDARs. Camera sensors are rich in color and texture information. However, cameras alone are poorly suited to 3D object detection because they lack depth information, and they are vulnerable to adverse conditions such as snow, fog, and night driving. Autonomous driving needs a fast and accurate perception system so that downstream modules, such as path planning and control, can operate robustly. LiDAR is a commonly used sensor for 3D object detection because it directly provides 3D geometric information. However, its lack of color and texture information reduces classification and detection performance. Therefore, no single sensor works well in all situations. In this work, we propose a multi-modal fusion 3D object detection model for autonomous driving that exploits the complementary strengths of the LiDAR and camera sensors. The model comprises a feature extraction network and a fusion network. The feature extraction networks transform the image and point cloud data into high-level features, and the features from the images and the LiDAR data are then fused. Experimental results on the nuScenes dataset show the model’s competitive performance for 3D object detection.
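The abstract describes the architecture only at a high level (per-modality feature extraction followed by fusion). The sketch below is a minimal, illustrative PyTorch rendering of that two-branch idea, not the paper's actual implementation: the backbone choices, feature dimensions, fusion operator (simple concatenation), and output heads are all assumptions.

```python
# Minimal sketch of a camera-LiDAR fusion detector (illustrative assumptions only):
# an image backbone and a point-cloud backbone each produce a feature vector,
# which are concatenated and passed to a small fusion network with class and
# 3D-box heads. Module names and sizes are hypothetical, not the paper's design.
import torch
import torch.nn as nn


class CameraLiDARFusion(nn.Module):
    def __init__(self, img_feat_dim=256, pc_feat_dim=256, num_classes=10):
        super().__init__()
        # Image branch: small CNN standing in for the image feature extractor.
        self.img_backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, img_feat_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Point-cloud branch: PointNet-like per-point MLP followed by max pooling.
        self.pc_backbone = nn.Sequential(
            nn.Linear(4, 128), nn.ReLU(),          # x, y, z, intensity per point
            nn.Linear(128, pc_feat_dim), nn.ReLU(),
        )
        # Fusion network: concatenate the two feature vectors, then predict
        # class scores and a 3D box (center, size, yaw) per sample.
        self.fusion = nn.Sequential(
            nn.Linear(img_feat_dim + pc_feat_dim, 256), nn.ReLU(),
        )
        self.cls_head = nn.Linear(256, num_classes)
        self.box_head = nn.Linear(256, 7)

    def forward(self, image, points):
        # image: (B, 3, H, W); points: (B, N, 4)
        img_feat = self.img_backbone(image)                    # (B, img_feat_dim)
        pc_feat = self.pc_backbone(points).max(dim=1).values   # (B, pc_feat_dim)
        fused = self.fusion(torch.cat([img_feat, pc_feat], dim=1))
        return self.cls_head(fused), self.box_head(fused)


if __name__ == "__main__":
    model = CameraLiDARFusion()
    scores, boxes = model(torch.randn(2, 3, 224, 224), torch.randn(2, 1024, 4))
    print(scores.shape, boxes.shape)  # torch.Size([2, 10]) torch.Size([2, 7])
```

In practice, fusion detectors often use stronger backbones (e.g., a ResNet for images and a voxel- or pillar-based network for point clouds) and dense detection heads over a bird's-eye-view grid; the concatenation here only illustrates where the two feature streams meet.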
