Abstract

Person image generation is a challenging task that aims to transfer the person in a source image from a source pose to a target pose while preserving the person's style. In this paper, we propose a Generative Adversarial Network based on Decoupled Semantic Attention Transfer (DSAT-GAN), addressing the problem that local semantic representations of different image styles and contents cannot be accurately decoupled and transferred. The architecture employs a novel Multi-scale Semantic Mapping Generation Network (Ms-SMGN), driven by two network modules with different semantic attention mechanisms, to accurately align and transfer local semantic representations at different spatial scales. A channel-separated convolution is then applied in the encoding networks in place of the traditional channel fully-connected operation, which reduces computational complexity while achieving channel-wise semantic decoupling. Moreover, a Gram matrix-based global style loss is introduced to further enhance high-level semantic consistency between generated and target images. Experiments on the Market-1501 and DeepFashion datasets show that DSAT-GAN outperforms other recent baselines. In addition, the architecture can be extended to data augmentation scenarios, significantly improving the accuracy of person re-identification.
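The Gram matrix-based global style loss mentioned above follows the standard formulation used in style transfer: for a feature map the Gram matrix captures channel-to-channel correlations, and the loss penalizes the distance between the Gram matrices of generated and target features. The sketch below is an illustrative NumPy implementation under that standard formulation, not the paper's exact code; the normalization constant and the set of feature layers compared are assumptions.

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a (C, H, W) feature map.

    Flattens the spatial dimensions and computes channel-to-channel
    inner products, normalized by the number of elements (an assumed
    normalization; papers vary in the exact constant).
    """
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def gram_style_loss(gen_feats, tgt_feats):
    """Global style loss: mean squared difference between the Gram
    matrices of generated and target features, summed over the
    feature layers being compared."""
    return sum(
        float(np.mean((gram_matrix(g) - gram_matrix(t)) ** 2))
        for g, t in zip(gen_feats, tgt_feats)
    )
```

Because the Gram matrix discards spatial arrangement and keeps only channel correlations, this term constrains global style (texture, color statistics) without forcing pixel-level alignment, which complements the pose-aligned losses.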
