Lightweight convolutional neural network for real-time 3D object detection in road and railway environments

A Mauri, R Boutteau, M Haddad, R Khemmar, B Decoux

doi:10.1007/s11554-022-01202-6

Abstract

For smart mobility and autonomous vehicles (AV), a very precise perception of the environment is necessary to guarantee reliable decision-making, and results obtained for the road sector should be extensible to other areas such as rail. To this end, we introduce a new single-stage, monocular, real-time 3D object detection convolutional neural network (CNN) based on YOLOv5, dedicated to smart mobility applications in both road and rail environments. To perform the 3D parameter regression, we replace YOLOv5’s anchor boxes with our hybrid anchor boxes. Like YOLOv5, our method is available in different model sizes: small, medium, and large. We also propose a new model, the small split attention model (Small-SA), which is optimized for real-time embedded constraints (lightweight, speed, and accuracy) and takes advantage of the improvement brought by split attention (SA) convolutions. To validate our CNN model, we also introduce a new virtual dataset for both road and rail environments by leveraging the video game Grand Theft Auto V (GTAV). We provide extensive results for our different models on both KITTI and our own GTAV dataset. Our results show that our method is the fastest available 3D object detection method, with accuracy close to state-of-the-art methods on the KITTI road dataset. We further demonstrate that pre-training on our GTAV virtual dataset improves accuracy on real datasets such as KITTI, allowing our method to obtain even greater accuracy than state-of-the-art approaches, with 16.16% 3D average precision on hard car detection and an inference time of 11.1 ms/image on an RTX 3080 GPU.
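To put the reported inference time in perspective, the short sketch below converts a per-image latency into a throughput figure and checks it against a camera frame budget. Only the 11.1 ms/image value comes from the abstract. The helper names and the 30 FPS camera rate are illustrative assumptions, and the calculation ignores image acquisition, pre-processing, and post-processing time.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def fits_frame_budget(latency_ms: float, camera_fps: float) -> bool:
    """Return True if one inference finishes within one camera frame period."""
    frame_period_ms = 1000.0 / camera_fps
    return latency_ms <= frame_period_ms


# Value reported in the abstract (RTX 3080 GPU).
REPORTED_LATENCY_MS = 11.1

# Hypothetical camera rate, chosen only for illustration.
ASSUMED_CAMERA_FPS = 30.0

fps = latency_ms_to_fps(REPORTED_LATENCY_MS)
print(f"{fps:.1f} images/s")
print(fits_frame_budget(REPORTED_LATENCY_MS, ASSUMED_CAMERA_FPS))
```

Under these assumptions, 11.1 ms/image corresponds to roughly 90 images per second, which leaves headroom within the 33.3 ms frame period of a 30 FPS camera.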

Highlights

YOLOv5: You only look once
KITTI: Karlsruhe Institute of Technology and Toyota Technological Institute
LiDAR: Light detection and ranging
L-CNN: Lightweight CNN
CARLA: CAR learning to act
ROAD: ROad event awareness dataset
SYNTHIA: SYNTHetic collection of Imagery and Annotations
R-CNN: Region-based CNN
LSTM: Long short-term memory
GAM3D: Ground-aware monocular 3D object
KFPN: Keypoint feature pyramid network
M3D-RPN: Monocular 3D region proposal network
SA: Split attention
API: Application programming interface
IOU: Intersection over union
AP: Average precision
ToF: Time-of-flight
DS: Dimension score
CS: Center score
We address two gaps, the lack of a railway dataset with ground-truth data and the lack of real-time 3D object detection, by proposing a new virtual dataset (GTAV) dedicated to 3D object detection in both road and railway environments, which includes images taken from the point of view of both cars and trains.
We demonstrate that pre-training our model on our dataset significantly improves the accuracy of our method on the KITTI dataset.

Summary

Introduction

With the rise of deep learning approaches for computer vision tasks and the emergence of CNNs, it has become possible to use cameras for object detection, depth estimation, tracking, and instance and semantic segmentation, among other tasks. These methods have many applications, especially in smart mobility, where they enhance safety and increase vehicle autonomy. We propose to fill two gaps, the lack of a railway dataset with ground-truth data and the lack of real-time 3D object detection, by proposing a new virtual dataset (GTAV) dedicated to 3D object detection in both road and railway environments, which includes images taken from the point of view of both cars and trains.

Journal: Journal of Real-Time Image Processing
Publication Date: Feb 11, 2022
Citations: 13
License type: open-access

Similar Papers

Real-time 3D object detection in unstructured environments
Wang Rui ... Liang Ying
01 Jun 2017

Real-time 3D Object Detection Using Improved Convolutional Neural Network Based on Image-driven Point Cloud
Zhiyong Gao ... Jianhong Xiang
Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) | VOL. 14
23 Dec 2021

CVFNet: Real-time 3D Object Detection by Learning Cross View Features
Jiaqi Gu ... Zhiyu Xiang
23 Oct 2022

SARPNET: Shape attention regional proposal network for liDAR-based 3D object detection
Yangyang Ye ... Zhaoxiang Zhang
Neurocomputing | VOL. 379
17 Oct 2019