Abstract
Cross-domain image translation, which aims to learn the mapping between two different domains, has made remarkable progress in recent years. A good cross-domain image translation model should satisfy three conditions: (1) it does not rely on paired datasets, (2) it can handle multiple domains, and (3) it produces diverse outputs from the same source image. Most state-of-the-art studies address only two of these, i.e., either (1) and (2), or (1) and (3). In this paper, we construct a unified diverse multi-domain image-to-image translation framework (DMDIT) that satisfies all three requirements simultaneously. Unlike traditional approaches, the proposed generator achieves diverse, multi-label image-to-image translation while retaining the underlying features of the input image. Diverse outputs are obtained through a latent noise vector randomly sampled from a normal distribution. To further improve the diversity of the outputs, we propose a novel style regularization loss to constrain the latent noise. Because mode collapse often arises when the noise is left unconstrained, we embed a noise separation module in the discriminator to avoid this issue. In addition, we apply an attention mechanism so that the model attentively focuses on the most attribute-relevant regions, which helps improve the quality of the generated images. Extensive qualitative and quantitative evaluations clearly demonstrate the effectiveness of our approach.
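The two core mechanisms named above, noise-driven diversity and a regularizer on the latent noise, can be sketched in PyTorch. The sketch below is illustrative only: the generator signature `generator(x, domain_label, z)`, the noise dimension `z_dim=8`, and the mode-seeking form of the regularizer are assumptions, since the abstract names the style regularization loss but does not specify its formulation.

```python
import torch
import torch.nn.functional as F


def diverse_translate(generator, x, domain_label, num_styles=3, z_dim=8):
    """Translate one source image into several diverse target-domain outputs
    by resampling the latent noise vector for each output (hypothetical API)."""
    outputs = []
    for _ in range(num_styles):
        # Diversity comes from drawing a fresh z ~ N(0, I) per output.
        z = torch.randn(x.size(0), z_dim, device=x.device)
        outputs.append(generator(x, domain_label, z))
    return outputs


def style_regularization(generator, x, domain_label, z_dim=8, eps=1e-5):
    """One plausible diversity-promoting regularizer on the latent noise:
    encourage outputs generated from different noise vectors to differ,
    scaled by the distance between the noise vectors (a mode-seeking-style
    term; the paper's actual loss may differ)."""
    z1 = torch.randn(x.size(0), z_dim, device=x.device)
    z2 = torch.randn(x.size(0), z_dim, device=x.device)
    y1 = generator(x, domain_label, z1)
    y2 = generator(x, domain_label, z2)
    # Penalize small output distance per unit of noise distance, which
    # pushes the generator away from collapsing distinct z's to one mode.
    ratio = F.l1_loss(y1, y2) / (F.l1_loss(z1, z2) + eps)
    return 1.0 / (ratio + eps)
```

In training, a term like `style_regularization(...)` would be added to the generator objective with a weighting coefficient, so that distinct noise samples are pushed toward visibly distinct translations of the same source image.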