Abstract

Regional facial image synthesis conditioned on a semantic mask has attracted considerable attention in computational visual media. However, the appearances of different regions may become inconsistent with one another after regional editing. In this paper, we focus on harmonized regional style transfer for facial images. We propose a multi-scale encoder for accurate style code extraction. The key component of our work is a multi-region style attention module, which adapts multiple regional style embeddings from a reference image to a target image to generate a harmonious result. We also propose style mapping networks for multi-modal style synthesis, and we further employ an invertible flow model, which can serve as a mapping network, to fine-tune the style code by inverting it to the latent space. We evaluate our model by transferring regional facial appearance across three widely used face datasets. The results show that our model reliably performs style transfer and multi-modal manipulation, generating output comparable to the state of the art.
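To make the attention step concrete, the following is a minimal PyTorch sketch of what a multi-region style attention module could look like: each spatial position of the target feature map attends over the per-region style codes extracted from a reference image, so the injected style is a soft blend across regions rather than a hard per-region assignment. The module and parameter names (MultiRegionStyleAttention, feat_dim, style_dim) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class MultiRegionStyleAttention(nn.Module):
    """Hypothetical sketch of multi-region style attention.

    Queries come from the target feature map; keys and values come from
    per-region style codes of the reference image, so each spatial position
    receives a softly blended (harmonized) regional style.
    """

    def __init__(self, feat_dim: int, style_dim: int):
        super().__init__()
        self.to_q = nn.Conv2d(feat_dim, style_dim, kernel_size=1)  # queries from target features
        self.to_k = nn.Linear(style_dim, style_dim)                # keys from regional style codes
        self.to_v = nn.Linear(style_dim, feat_dim)                 # values injected into features
        self.scale = style_dim ** -0.5

    def forward(self, feat: torch.Tensor, styles: torch.Tensor) -> torch.Tensor:
        # feat:   (B, C, H, W) target feature map
        # styles: (B, R, S) one style code per semantic region of the reference
        b, c, h, w = feat.shape
        q = self.to_q(feat).flatten(2).transpose(1, 2)                     # (B, H*W, S)
        k = self.to_k(styles)                                              # (B, R, S)
        v = self.to_v(styles)                                              # (B, R, C)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)   # (B, H*W, R)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)               # blended style per position
        return feat + out                                                  # residual style injection


# Toy usage: 8 semantic regions, 64-d style codes, 256-channel features.
attn = MultiRegionStyleAttention(feat_dim=256, style_dim=64)
fused = attn(torch.randn(2, 256, 32, 32), torch.randn(2, 8, 64))
print(fused.shape)  # torch.Size([2, 256, 32, 32])
```

Under these assumptions, the residual injection preserves the target's content features, while the softmax blend over regions is what would encourage the harmonized, rather than hard-assigned, regional styles the abstract describes.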
