Abstract

Semantic portrait synthesis has drawn sustained attention and made significant progress, yet achieving style diversity and semantic controllability simultaneously remains a challenge. Existing methods either 1) take a semantic label map directly as input, ignoring the many possible styles of each semantic, or 2) sample global noise as input, ignoring the controllability of local semantics. To fill this gap, we propose semantic-aware noise, a simple but effective input that tackles both issues and shows improved results over baselines. Semantic-aware noise introduces semantic information into the noise: each semantic is sampled from the noise separately, combining semantic controllability with the diversity of noise sampling. To further extend the method to real image manipulation, we propose a novel ternary network structure that allows diverse semantic image synthesis and real image manipulation in a unified framework. Extensive experiments demonstrate that the proposed method achieves quantitatively superior and perceptually pleasing results compared to state-of-the-art methods. We also analyze the performance of our method with respect to different noise structures and in real-life applications, including diverse synthesis, interactive manipulation, and extreme pose scenarios.
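As a rough illustration of the idea of sampling noise per semantic, the sketch below builds a noise map in which every pixel carries the noise code of its semantic class. It is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the Gaussian prior, the per-class code layout, and the names `semantic_aware_noise` and `resample_class` are hypothetical choices made here for illustration.

```python
import numpy as np


def semantic_aware_noise(label_map, num_classes, noise_dim, rng):
    """Return an (H, W, noise_dim) noise map and the per-class codes.

    Assumption: one independent Gaussian code per semantic class,
    broadcast to all pixels that carry that class label.
    """
    codes = rng.standard_normal((num_classes, noise_dim))
    # Integer-array indexing copies each class code to its pixels.
    return codes[label_map], codes


def resample_class(codes, class_id, rng):
    """Resample the code of a single class, leaving the others unchanged."""
    new_codes = codes.copy()
    new_codes[class_id] = rng.standard_normal(codes.shape[1])
    return new_codes


rng = np.random.default_rng(0)
# Toy 2x3 label map with three semantic classes (e.g. skin, hair, background).
label_map = np.array([[0, 0, 1],
                      [2, 2, 1]])
noise_map, codes = semantic_aware_noise(label_map, num_classes=3, noise_dim=4, rng=rng)

# Change the style of class 1 only; classes 0 and 2 keep their noise.
new_codes = resample_class(codes, class_id=1, rng=rng)
new_noise_map = new_codes[label_map]
```

In this sketch, editing the label map changes where each code is placed (local control), while resampling a single class code changes the style of that region only (diversity).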
