Abstract

Image synthesis designed for machine learning applications provides the means to efficiently generate large quantities of training data while controlling the generation process to provide the best distribution and content variety. With the demands of deep learning applications, synthetic data have the potential to become a vital component in the training pipeline. Over the last decade, a wide variety of training data generation methods has been demonstrated. The potential for future development calls for bringing these together for comparison and categorization. This survey provides a comprehensive list of the existing image synthesis methods for visual machine learning. These are categorized in the context of image generation, using a taxonomy based on modelling and rendering, while a classification is also made concerning the computer vision applications in which they are used. We focus on the computer graphics aspects of the methods, to promote future image generation for machine learning. Finally, each method is assessed in terms of quality and reported performance, providing a hint of its expected learning potential. The report serves as a comprehensive reference, targeting both the application side and the data development side. A list of all methods and papers reviewed herein can be found at https://computergraphics.on.liu.se/image_synthesis_methods_for_visual_machine_learning/.

Highlights

  • We are currently witnessing a strong trend in the use of machine learning (ML), through deep learning (DL) [LBH15, GBC16]

  • To distinguish between the image synthesis methods used in ML, we provide a taxonomy that considers the image generation perspective

  • The recent advances in ML, and in DL, have made it clear that the development of efficient algorithms relies to a large extent on the available training data fed to the learning algorithm


Summary

Introduction

We are currently witnessing a strong trend in the use of machine learning (ML), through deep learning (DL) [LBH15, GBC16]. Today the bottleneck is often caused by the limited availability and quality of training data [RHW18]. Data acquisition is a limiting factor, partly due to the actual capturing process, but most often because annotations for supervised learning can be expensive and prohibitively time-consuming to generate. It is also difficult to cover all possible situations that are relevant. These problems have made it crucial to make the most of the available training data, and augmentation techniques for various purposes (generalization, domain adaptation, adversarial robustness, regularization, etc.) are today an essential step in the DL pipeline [SK19, PW17]. While data augmentation can be thought of as a synthetic data generation process, the synthesized samples are bound by the data at hand. The recent developments in learning-based generative modelling, and how image synthesis relates to methods for data augmentation, are described. The interest has since expanded to a diverse set of applications, including advertisements, medicine, virtual reality, science and engineering.
