Abstract

Image-to-image translation based on deep learning has attracted interest in the robotics and vision community because of its potential impact on terrain analysis and on image representation, interpretation, modification, and enhancement. Currently, the most successful approach for generating a translated image is a conditional generative adversarial network (cGAN) that trains an autoencoder with skip connections. Despite its impressive performance, this approach suffers from low accuracy, inconsistent outputs, and imbalanced training. This paper proposes a balanced training strategy for image-to-image translation, resulting in an accurate and consistent network. The proposed approach uses two generators and a single discriminator. The generators translate images from one domain to another. The discriminator takes inputs in three different configurations and guides both generators to produce realistic images in their corresponding domains while ensuring high accuracy and consistency. Experiments are conducted on different datasets. In particular, the proposed approach outperforms the cGAN in realistic image translation, achieving higher accuracy and more consistent training.
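The two-generator, single-discriminator setup described above can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the "networks" are stand-in linear maps rather than convolutional autoencoders with skip connections, and the exact three discriminator input configurations shown here, a real pair and two mixed real/generated pairs, are an assumption about the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # toy "image" dimensionality

# Stand-in generator weights: g_ab maps domain A -> B, g_ba maps B -> A.
W_ab = rng.normal(size=(dim, dim)) * 0.1
W_ba = rng.normal(size=(dim, dim)) * 0.1
w_d = rng.normal(size=2 * dim) * 0.1  # discriminator scores an (A, B) pair

def g_ab(x):
    return np.tanh(x @ W_ab)

def g_ba(y):
    return np.tanh(y @ W_ba)

def d(a, b):
    """Discriminator on a concatenated (A, B) pair -> probability 'real'."""
    return 1.0 / (1.0 + np.exp(-np.concatenate([a, b]) @ w_d))

def bce(p, target):
    """Binary cross-entropy on a single probability score."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

real_a = rng.normal(size=dim)
real_b = rng.normal(size=dim)

# Three discriminator configurations (assumed): one fully real pair,
# and two mixed pairs, each containing one generator's output.
d_loss = (bce(d(real_a, real_b), 1.0)           # real/real -> "real"
          + bce(d(real_a, g_ab(real_a)), 0.0)   # real/fake -> "fake"
          + bce(d(g_ba(real_b), real_b), 0.0))  # fake/real -> "fake"

# Both generators try to fool the same shared discriminator; a
# cycle-style consistency term keeps the two directions coherent.
g_loss = (bce(d(real_a, g_ab(real_a)), 1.0)
          + bce(d(g_ba(real_b), real_b), 1.0)
          + np.abs(g_ba(g_ab(real_a)) - real_a).mean())

print(float(d_loss), float(g_loss))
```

Because the single discriminator sees contributions from both generators in every step, neither generator's gradient signal dominates, which is one plausible reading of the "balanced training" the abstract describes.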

Highlights

  • Translating images between different domains has many important applications in the field of robotics and computer vision, including terrain shape estimation, tip-over and collision avoidance, scene understanding, image colorization, styling, de-noising, and modification

  • After training the proposed consistent image-to-image translation network for translating images from one domain to another by using different datasets, we evaluated its performance in terms of realistic translation between different domains

  • We proposed a consistent image-to-image translation network to address the problem of inconsistency in the conditional generative adversarial network (cGAN) [10]


Introduction

Translating images between different domains has many important applications in robotics and computer vision, including terrain shape estimation, tip-over and collision avoidance, scene understanding, image colorization, styling, de-noising, and modification. Such translation requires semantic knowledge of the pairwise embedding that exploits natural correspondences between domains. These correspondences can be categorized by aspect and problem, and humans recognize them naturally: an RGB image corresponds to its depth map, an edge-based representation to its real image, and an aerial image to a map image. We explore image-to-image translation as the problem of translating an image representation from one domain to its corresponding domain, given sufficient training data.

