Abstract

The image‐to‐image translation (I2IT) task aims to transform images from the source domain into a specified target domain. State‐of‐the‐art CycleGAN‐based translation algorithms typically use a cycle consistency loss and a latent regression loss to constrain the translation. In this work, it is demonstrated that constraining the model parameters with the cycle consistency loss and the latent regression loss is equivalent to optimizing the medians of the data distribution and the generative distribution. In addition, there is a style bias in the translation. This bias propagates between the generator and the style encoder and manifests visually as translation errors, e.g., the style of the generated image does not match the style of the reference image. To address these issues, a new I2IT model termed high‐quality‐I2IT (HQ‐I2IT) is proposed. The optimization scheme is redesigned to prevent the model from optimizing the median of the data distribution. In addition, by separating the optimization of the generator and the latent code estimator, the redesigned model avoids error interactions and gradually corrects errors during training, thereby avoiding learning the median of the generative distribution. The experimental results demonstrate that the visual quality of the images produced by HQ‐I2IT is significantly improved without changing the generator structure, especially when guided by reference images. Specifically, the Fréchet inception distances on the AFHQ and CelebA‐HQ datasets are reduced from 19.8 to 10.2 and from 23.8 to 17.0, respectively.
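
For readers unfamiliar with the two constraints the abstract refers to, the following is a minimal sketch of how a cycle consistency loss and a latent regression loss are commonly formulated in reference-guided I2IT models of this family. The generator `G`, style encoder `E`, and argument names are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the two standard constraints discussed in the abstract.
# G(x, s): generator mapping image x with style code s to the target domain.
# E(x): style encoder extracting a style code from image x.
# Both are assumed to be torch.nn.Module-like callables.
import torch.nn.functional as F

def cycle_consistency_loss(G, E, x_src, x_ref):
    """Translate x_src toward x_ref's style, translate back with the
    original style, and penalize the L1 reconstruction error."""
    s_ref = E(x_ref)            # style code of the reference image
    x_fake = G(x_src, s_ref)    # source content rendered in the reference style
    s_src = E(x_src)            # style code of the original source image
    x_rec = G(x_fake, s_src)    # map back toward the source style
    return F.l1_loss(x_rec, x_src)

def latent_regression_loss(G, E, x_src, s_target):
    """Require the style encoder to recover the style code that was
    used to generate the translated image."""
    x_fake = G(x_src, s_target)
    return F.l1_loss(E(x_fake), s_target)
```

Under this formulation, both losses couple the generator and the style encoder through each other's outputs, which is the interaction the paper identifies as a source of style bias and proposes to decouple.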
