Abstract
Pedestrian detection is a particular case of object detection that helps to reduce accidents in advanced driver-assistance systems and autonomous vehicles. It is not an easy task because of the variability of the objects and the strict time constraints. A performance comparison of object detection methods, including both GPU and non-GPU implementations over a variety of on-road specific databases, is provided. Computer vision multi-class object detection can be integrated into sensor fusion modules, where recall is preferred over precision. For this reason, ad hoc training with a single class for pedestrians has been performed, achieving a significant increase in recall. Experiments have been carried out on several architectures, and a special effort has been devoted to achieving a feasible computational time for a real-time system. Finally, an analysis of the input image size allows us to fine-tune the model and obtain better results at practical costs.
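The recall-over-precision preference mentioned above can be made concrete with the standard detection metrics. Below is a minimal, hypothetical sketch (the function name, the example counts, and the IoU matching criterion are our own illustrative choices, not taken from the paper) of how the two quantities are computed from matched detections.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Standard detection metrics: a true positive (tp) is a predicted box
    that matches a ground-truth box (e.g. IoU >= 0.5)."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # fraction of detections that are correct
    recall = tp / (tp + fn) if tp + fn else 0.0     # fraction of pedestrians that are found
    return precision, recall

# In a sensor-fusion pipeline, missed pedestrians (fn) are costlier than
# spurious boxes (fp), which later fusion stages can discard, so recall
# is the metric to maximize. Example counts below are made up.
print(precision_recall(tp=80, fp=40, fn=10))
```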
Highlights
Object detection is a central problem in Computer Vision
Pedestrian detection constitutes one of the most challenging tasks in on-road object detection for two main reasons: the variability of the objects and the strict time constraints
While generic-class databases, such as COCO [20], are the starting point for developing general detection algorithms, we focused our effort on specific on-road and human image databases
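As an illustration of deriving a single-class pedestrian set from a generic-class database, the following sketch uses the public pycocotools API to keep only 'person' annotations from COCO. This is our own example under assumed file paths; the paper's exact relabelling pipeline may differ.

```python
from pycocotools.coco import COCO

# Path is a placeholder; point it at a real COCO annotation file.
coco = COCO("annotations/instances_train2017.json")

# Restrict the generic 80-class dataset to the single 'person' category,
# mirroring the single-class (pedestrian) training described in the abstract.
person_cat_ids = coco.getCatIds(catNms=["person"])
person_img_ids = coco.getImgIds(catIds=person_cat_ids)
ann_ids = coco.getAnnIds(imgIds=person_img_ids, catIds=person_cat_ids, iscrowd=False)
annotations = coco.loadAnns(ann_ids)

print(f"{len(person_img_ids)} images with {len(annotations)} pedestrian boxes")
```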
Summary
Object detection is a central problem in Computer Vision. Its goal is to detect the location and class of each object in images or image sequences. Two-stage algorithms predict detections in two phases: first, they use pixel-level spatial features to extract Regions of Interest; then, a second phase classifies each proposal to decide whether the region contains a pedestrian or not. These methods usually produce better detection results, but they are more computationally expensive [16], so they are less used in real-time (RT) detection tasks because of the limited computational power of the resource-constrained devices usually installed on-board.

NuScenes is a novel, public, large-scale dataset for autonomous driving. It includes data from the full sensor suite of a self-driving car (RADAR, LiDAR, cameras, IMU and GPS), with more than 1.4 million camera images, and it provides manually labelled annotations for 23 classes, including VRUs.

The databases used are joined to obtain a complete dataset that tries to represent as much variability as possible, including different image sizes and aspect ratios, weather conditions, cities and roads, and a wide range of light conditions (see Figure 1). This allows the architecture to run in RT even on non-GPU resource-constrained systems.
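To make the two-stage pipeline concrete, here is a minimal sketch using torchvision's Faster R-CNN, a canonical two-stage detector in which the region proposal network plays the role of the first phase and the box classifier the second. This is our illustration, not the paper's exact architecture; the input tensor and the score threshold are assumed values.

```python
import torch
import torchvision

# Faster R-CNN: stage 1 (the RPN) proposes Regions of Interest from
# pixel-level features; stage 2 classifies and refines each proposal.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 640, 480)  # placeholder; a real RGB tensor scaled to [0, 1]
with torch.no_grad():
    prediction = model([image])[0]  # dict with 'boxes', 'labels', 'scores'

# Keep confident pedestrian detections (COCO label 1 == 'person';
# the 0.5 score threshold is an assumed value, not the paper's).
keep = (prediction["labels"] == 1) & (prediction["scores"] > 0.5)
pedestrian_boxes = prediction["boxes"][keep]
print(pedestrian_boxes)
```

NuScenes ships with a public devkit; the following sketch shows one way to pull a front-camera frame and its pedestrian annotations. The dataroot path and the 'v1.0-mini' split are placeholders, and this is only one possible access pattern.

```python
from nuscenes.nuscenes import NuScenes

# Placeholders: install nuscenes-devkit and point dataroot at a local copy.
nusc = NuScenes(version="v1.0-mini", dataroot="/data/sets/nuscenes", verbose=False)

sample = nusc.sample[0]                  # one keyframe across all sensors
cam_token = sample["data"]["CAM_FRONT"]  # front-camera image for this keyframe
cam_data = nusc.get("sample_data", cam_token)
print("image file:", cam_data["filename"])

# VRU annotations: NuScenes category names for pedestrians all start
# with 'human.pedestrian' (adult, child, construction_worker, ...).
for ann_token in sample["anns"]:
    ann = nusc.get("sample_annotation", ann_token)
    if ann["category_name"].startswith("human.pedestrian"):
        print(ann["category_name"], ann["translation"])
```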