Abstract

Digitizing roadside objects such as traffic signs is a necessary step in generating High Definition Maps (HD Maps), and it remains an open challenge. The rapid development of deep learning with Convolutional Neural Networks (CNNs) has brought great success in computer vision in recent years. However, the performance of most deep learning algorithms depends heavily on the quality of the training data. Collecting the desired training dataset is difficult, especially for roadside objects, because their numbers along the roadside are imbalanced. Although training neural networks on synthetic data has been proposed, a distribution gap between synthetic and real data still exists and can degrade performance. We propose to transfer the style between synthetic and real data using Multi-Task Generative Adversarial Networks (SYN-MTGAN) before training the neural network that detects roadside objects. Experiments focusing on traffic signs show that our proposed method reaches an mAP of 0.77 and improves detection performance for objects whose training samples are difficult to collect.

Highlights

  • In recent years, images, including panoramic images, and point clouds collected by Mobile Mapping Systems (MMS) have been used to generate HD maps, which can be applied to autonomous driving, smart cities, and similar applications

  • Benefiting from recent advances in deep learning, several methods have been proposed to extract objects of interest from images or point clouds using Convolutional Neural Network (CNN) based algorithms. Wolf et al. (2019) proposed a method to detect manholes and road markings by semantic segmentation of images rendered from point clouds

  • We propose a multi-task Generative Adversarial Network (GAN) architecture, SYN-MTGAN, to generate fake images from synthetic scene images. These images can be used as training data for an object detector such as Faster R-CNN when the target objects are hard to collect in real scenes


Summary

Introduction

Images, including panoramic images, and point clouds collected by Mobile Mapping Systems (MMS) are used to generate HD maps, which can be applied to autonomous driving, smart cities, and similar applications. Benefiting from recent advances in deep learning, several methods have been proposed to extract objects of interest from images or point clouds using CNN-based algorithms. CNN-based algorithms depend heavily on the quality of the training dataset, whose creation is both time-consuming and costly. For 70 categories of traffic signs, it is hard to collect enough samples to train a CNN model, while the remaining 20 categories contribute more than 90% of the samples in the dataset. This is known as the long-tail phenomenon, and it can cause a significant drop in performance.
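The long-tail imbalance described above can be quantified by measuring how much of a dataset the most frequent classes account for. The following is a minimal sketch, not the authors' code: the function name `head_share` and the toy label list are hypothetical and chosen only for illustration.

```python
from collections import Counter


def head_share(labels, head_size):
    """Return the fraction of samples that belong to the
    `head_size` most frequent classes in `labels`."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # most_common returns (class, count) pairs sorted by descending count
    head = counts.most_common(head_size)
    return sum(n for _, n in head) / len(labels)


# Hypothetical toy dataset: two frequent sign classes and three rare ones.
labels = (
    ["stop"] * 50
    + ["yield"] * 40
    + ["rare_a"] * 4
    + ["rare_b"] * 3
    + ["rare_c"] * 3
)

share = head_share(labels, head_size=2)
print(f"Top 2 classes hold {share:.0%} of the samples")
```

On a real annotation file, the same function would take one class label per annotated sign. A value close to 1.0 for a small `head_size` indicates that most classes fall in the tail and are candidates for synthetic augmentation.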

