Abstract

Person image generation is a challenging task that aims to transfer the person in a source image from a source pose to a target pose while preserving the person's style. In this paper, we propose a Generative Adversarial Network based on Decoupled Semantic Attention Transfer (DSAT-GAN), which addresses the problem that local semantic representations of different image styles and contents cannot be accurately decoupled and transferred. The architecture employs a novel Multi-scale Semantic Mapping Generation Network (Ms-SMGN), driven by two network modules with different semantic attention mechanisms, to accurately align and transfer local semantic representations at different spatial scales. In addition, channel-separated convolution is applied in the encoding networks in place of the traditional channel fully connected operation, which reduces computational complexity while achieving channel semantic decoupling. Moreover, a Gram matrix-based global style loss is introduced to further enhance the consistency of high-level semantics between generated and target images. Experiments on the Market-1501 and DeepFashion datasets show that DSAT-GAN outperforms other recent baselines. Additionally, the architecture can be extended to data augmentation scenarios, where it significantly improves the accuracy of person re-identification.
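For readers unfamiliar with Gram matrix-based style losses, the following is a minimal sketch of the general idea and not the formulation used in DSAT-GAN. The abstract does not specify the feature extractor, the layers, the normalization, or the distance measure, so the function names `gram_matrix` and `global_style_loss`, the division by C*H*W, and the mean squared difference are all assumptions that follow the common neural style transfer convention.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map.

    Assumption: normalized by C*H*W, a common convention; the paper's
    normalization is not stated in the abstract.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)      # one row per channel
    return flat @ flat.T / (c * h * w)     # (C, C) channel correlations

def global_style_loss(generated_feats, target_feats):
    """Sum over layers of the mean squared Gram matrix difference.

    Assumption: both arguments are lists of (C, H, W) arrays taken from
    the same layers of some fixed feature extractor (hypothetical here).
    """
    loss = 0.0
    for g, t in zip(generated_feats, target_feats):
        diff = gram_matrix(g) - gram_matrix(t)
        loss += float(np.mean(diff ** 2))
    return loss

# Usage with random stand-in features (not real network activations).
rng = np.random.default_rng(0)
generated = [rng.standard_normal((4, 8, 8))]
target = [rng.standard_normal((4, 8, 8))]
print(global_style_loss(generated, target))
```

Because the Gram matrix discards spatial positions and keeps only channel co-activation statistics, a loss of this kind compares global style rather than pixel-aligned content, which fits a task where the pose changes between the generated and reference images.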
