Abstract

Pedestrian detection through computer vision is a building block for a multitude of applications. Recently, there has been increasing interest in convolutional neural network-based architectures for this task. A critical goal of these supervised networks is to generalize the knowledge learned during training to new scenarios with different characteristics, and a suitably labeled dataset is essential to achieve it. The main problem is that manually annotating a dataset usually requires substantial human effort and is costly. To this end, we introduce ViPeD (Virtual Pedestrian Dataset), a new synthetically generated set of images collected with the highly photo-realistic graphical engine of the video game GTA V (Grand Theft Auto V), where annotations are acquired automatically. However, when trained solely on the synthetic dataset, the model suffers a Synthetic2Real domain shift, leading to a performance drop when applied to real-world images. To mitigate this gap, we propose two domain adaptation techniques suitable for the pedestrian detection task, but possibly applicable to general object detection as well. Experiments show that the network trained with ViPeD, exploiting the variety of our synthetic dataset, generalizes to unseen real-world scenarios better than a detector trained on real-world data. Furthermore, we demonstrate that our domain adaptation techniques reduce the Synthetic2Real domain shift, bringing the two domains closer and improving performance when testing the network on real-world images.

Highlights

  • A key task in many intelligent video surveillance systems is pedestrian detection, as it provides essential information for the semantic understanding of video

  • We introduce and make publicly available ViPeD, a new vast synthetic dataset suitable for the pedestrian detection task, with images generated using the photo-realistic video game GTA V

  • We address the pedestrian detection task by proposing a Convolutional Neural Network (CNN)-based solution trained on synthetically generated data


Summary

Introduction

A key task in many intelligent video surveillance systems is pedestrian detection, as it provides essential information for the semantic understanding of video. Since manually annotating new collections of images is expensive and requires great human effort, a promising recent approach is to gather data from virtual-world environments that mimic, as much as possible, the characteristics of real-world scenarios, and where annotations can be acquired with a partially automated process. To this end, we introduce and make publicly available ViPeD (Virtual Pedestrian Dataset), a new vast synthetic dataset suitable for the pedestrian detection task, generated with the highly photo-realistic graphical engine of the video game GTA V (Grand Theft Auto V) by Rockstar; it extends the JTA (Joint Track Auto) dataset presented in [9]. The code, the models, and the dataset are freely available at https://ciampluca.github.io/viped/.
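One of the two domain adaptation techniques listed in the outline, balanced gradient contribution, is commonly formulated as mixing source (synthetic) and target (real) samples at a fixed ratio inside every training batch, so that gradients from both domains contribute to each update. Below is a minimal, framework-agnostic sketch of that batch-composition idea; the function name, the 75/25 split, and the dataset placeholders are illustrative assumptions, not the paper's exact recipe.

```python
import random

def mixed_batches(synthetic, real, batch_size=8, real_fraction=0.25, seed=0):
    """Yield training batches mixing synthetic and real samples.

    Each batch holds round(batch_size * real_fraction) real samples
    (at least one), with the remainder drawn from the synthetic set,
    so every gradient step sees both domains. Illustrative sketch only.
    """
    rng = random.Random(seed)
    n_real = max(1, round(batch_size * real_fraction))
    n_syn = batch_size - n_real

    syn = list(synthetic)
    rl = list(real)
    rng.shuffle(syn)

    # Walk through the (larger) synthetic set in chunks; for each chunk,
    # draw a fresh random handful of real samples to complete the batch.
    for start in range(0, len(syn) - n_syn + 1, n_syn):
        batch = syn[start:start + n_syn] + rng.sample(rl, n_real)
        rng.shuffle(batch)  # avoid a fixed domain ordering inside the batch
        yield batch
```

In an actual detection pipeline, the elements of `synthetic` and `real` would be (image, annotations) pairs fed to the detector's loss; here they can be any objects, which keeps the sampling logic independent of the training framework.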

Pedestrian Detection
Synthetic2Real Domain Adaptation
Training with Synthetic Datasets
Domain Adaptation for Synthetic2Real Pedestrian Detection
Faster R-CNN Object Detector
Domain Adaptation Using Real-World Fine-Tuning
Domain Adaptation using Balanced Gradient Contribution
Experimental Evaluation
Real-World Datasets
Experiments
Testing Generalization Capabilities
Testing Domain Adaptation Techniques over Specific Real-World Scenarios
Method
Conclusions

