Abstract

Significant progress has been made on generating images from structured semantic descriptions, but the generated images retain only semantic information: the appearance of objects cannot be constrained or effectively represented. We therefore propose a scene-graph-based image generation method assisted by object edge information. Our model uses two graph convolutional networks (GCNs) to process the scene graph, obtaining object features and relation features that aggregate information from related nodes. Object bounding boxes are predicted by a method that decouples size from position, and auxiliary models are added to assist the training of the segmentation mask network. Finally, a cascaded refinement network generates the image. Our experiments show that introducing object edges provides clearer object appearance information for image generation, which constrains object shapes and greatly improves image quality. Moreover, compared with other appearance features, such as object slices, edge information occupies far less data, yielding a large improvement in image quality for only a small increase in input information; this property also benefits semantic communication systems. Extensive experiments show that our method significantly outperforms the state-of-the-art Sg2im method on the Visual Genome dataset.
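The abstract does not include code; as an illustrative sketch only (not the authors' implementation), one graph-convolution step over scene-graph triples, in the general style of Sg2im-like models, might look like the following. The function name, the shared weight matrix `W`, and the mean aggregation of messages are all assumptions made for illustration.

```python
import numpy as np

def scene_graph_gcn_step(obj_feats, pred_feats, triples, W):
    """One illustrative graph-convolution step over scene-graph triples.

    obj_feats:  (N, D) object feature vectors
    pred_feats: (T, D) predicate (relation) feature vectors
    triples:    list of (subject_idx, object_idx), one per predicate row
    W:          (3*D, 3*D) shared weight matrix applied to the concatenated
                (subject, predicate, object) features

    Messages sent to each object are averaged over all triples touching it;
    objects appearing in no triple keep their original features.
    """
    N, D = obj_feats.shape
    msg_sum = np.zeros_like(obj_feats)
    msg_cnt = np.zeros(N)
    new_pred = np.zeros_like(pred_feats)
    for t, (s, o) in enumerate(triples):
        x = np.concatenate([obj_feats[s], pred_feats[t], obj_feats[o]])
        h = np.maximum(x @ W, 0.0)               # linear map + ReLU
        hs, hp, ho = h[:D], h[D:2 * D], h[2 * D:]
        msg_sum[s] += hs
        msg_sum[o] += ho
        msg_cnt[s] += 1
        msg_cnt[o] += 1
        new_pred[t] = hp                         # updated relation feature
    touched = msg_cnt > 0
    new_obj = obj_feats.copy()
    new_obj[touched] = msg_sum[touched] / msg_cnt[touched][:, None]
    return new_obj, new_pred

# Toy example: 3 objects, 2 relations, feature dimension D = 4.
rng = np.random.default_rng(0)
objs = rng.normal(size=(3, 4))
preds = rng.normal(size=(2, 4))
triples = [(0, 1), (1, 2)]   # e.g. "sky above grass", "grass below sheep"
W = rng.normal(size=(12, 12))
new_objs, new_preds = scene_graph_gcn_step(objs, preds, triples, W)
```

Stacking several such steps lets each object feature aggregate information from its relational neighborhood, which is the role the two GCNs play in the model described above.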
