Abstract

We introduce an end-to-end learning framework for image-to-image composition, which aims to plausibly compose an object, represented as a cropped patch from an object image, into a background scene image. Since our approach emphasizes the semantic and structural coherence of the composed images rather than their pixel-level RGB accuracy, we tailor the input and output of our network with structure-aware features and design our network losses accordingly, with ground truth established in a self-supervised setting through the object cropping. Specifically, our network takes as inputs the semantic layout features of the scene image, features encoded from the edges and silhouette of the object patch, and a latent code, and generates a 2D spatial affine transform defining the translation and scaling of the object patch. The predicted parameters are fed into a differentiable spatial transformer network that transforms the object patch into the target image, and the model is trained adversarially using an affine transform discriminator and a layout discriminator. We evaluate our network, coined SAC-GAN, on various image composition scenarios in terms of the quality, composability, and generalizability of the composite images. Comparisons with state-of-the-art alternatives, including Instance Insertion, ST-GAN, CompGAN, and PlaceNet, confirm the superiority of our method.
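To illustrate the differentiable composition step described above, the following is a minimal sketch of applying a predicted scale-and-translation affine transform to an object patch and alpha-compositing it onto a scene, in the spirit of a spatial transformer. It assumes PyTorch, an RGBA object patch zero-padded to the scene resolution, and normalized [-1, 1] translation coordinates; the function name, tensor layout, and coordinate convention are our assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def paste_with_affine(patch_rgba, scene, scale, tx, ty):
    """Warp an RGBA object patch by a predicted scale + translation and
    alpha-composite it onto the scene (hypothetical helper, not from the paper).

    patch_rgba: (N, 4, H, W) object patch padded to the scene resolution
    scene:      (N, 3, H, W) background scene image
    scale, tx, ty: (N,) tensors predicted by the network; tx, ty in [-1, 1]
    """
    n = scene.shape[0]
    # affine_grid uses inverse mapping (output pixel -> input patch coords),
    # so placing the patch at scale s and offset (tx, ty) needs theta = [1/s, -t/s].
    theta = torch.zeros(n, 2, 3, device=scene.device, dtype=scene.dtype)
    theta[:, 0, 0] = 1.0 / scale
    theta[:, 1, 1] = 1.0 / scale
    theta[:, 0, 2] = -tx / scale
    theta[:, 1, 2] = -ty / scale

    # Differentiable sampling: gradients flow back to scale, tx, ty.
    grid = F.affine_grid(theta, patch_rgba.shape, align_corners=False)
    warped = F.grid_sample(patch_rgba, grid, padding_mode='zeros', align_corners=False)

    rgb, alpha = warped[:, :3], warped[:, 3:4]
    # Alpha-composite the warped patch over the scene.
    return alpha * rgb + (1.0 - alpha) * scene
```

Because both the grid generation and the sampling are differentiable, adversarial losses on the composite (e.g., from layout or transform discriminators) can be backpropagated to the network that predicts the transform parameters.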
