Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Siyu Zou, Chaoyi Zhao, Jiji Tang, Xiaoshuai Sun, Zhipeng Hu, Yiyi Zhou, Jing He, Rongsheng Zhang

doi:10.1609/aaai.v38i7.28622

Abstract

Diffusion-based Image Editing (DIE) is an emerging research hotspot that often applies a semantic mask to control the target area of the edit. However, most existing solutions obtain these masks through manual operations or offline processing, which greatly reduces their efficiency. In this paper, we propose a novel and efficient image editing method for Text-to-Image (T2I) diffusion models, termed Instant Diffusion Editing (InstDiffEdit). In particular, InstDiffEdit exploits the cross-modal attention of existing diffusion models to provide instant mask guidance during the diffusion steps. To reduce the noise in the attention maps and make the process fully automatic, we equip InstDiffEdit with a training-free refinement scheme that adaptively aggregates the attention distributions to generate accurate masks automatically. Meanwhile, to supplement existing evaluations of DIE, we propose a new benchmark called Editing-Mask to examine the mask accuracy and local editing ability of existing methods. To validate InstDiffEdit, we also conduct extensive experiments on ImageNet and Imagen and compare it with a range of SOTA methods. The experimental results show that InstDiffEdit not only outperforms the SOTA methods in both image quality and editing results, but also has a much faster inference speed, i.e., 5 to 6 times faster. Our code is available at https://anonymous.4open.science/r/InstDiffEdit-C306
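
The general idea of mask-guided editing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' refinement scheme: the min-max normalization, the fixed threshold of 0.5, and the function names `attention_to_mask` and `blend` are all assumptions chosen for illustration. The sketch binarizes a small 2D attention map and then keeps edited values only inside the resulting mask.

```python
def attention_to_mask(attn, threshold=0.5):
    """Min-max normalize a 2D attention map and binarize it.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    flat = [v for row in attn for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    return [[1 if (v - lo) / span >= threshold else 0 for v in row] for row in attn]


def blend(original, edited, mask):
    """Keep edited values inside the mask and original values outside it."""
    return [
        [e if m else o for o, e, m in zip(o_row, e_row, m_row)]
        for o_row, e_row, m_row in zip(original, edited, mask)
    ]


# Toy 2x2 example: the bottom row receives high attention.
attn = [[0.1, 0.2], [0.7, 0.9]]
mask = attention_to_mask(attn)
original = [[1.0, 1.0], [1.0, 1.0]]
edited = [[5.0, 5.0], [5.0, 5.0]]
result = blend(original, edited, mask)
```

In a real diffusion pipeline the blend would presumably be applied to latents at each denoising step rather than to a toy grid, and the mask would come from the model's cross-attention rather than a hand-written array.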

More From: Proceedings of the AAAI Conference on Artificial Intelligence

Similar Papers

PFB-Diff: Progressive Feature Blending diffusion for text-driven image editing
Wenjing Huang ... Lei Xu
Neural Networks | VOL. 181
09 Oct 2024

BARET: Balanced Attention Based Real Image Editing Driven by Target-Text Inversion
Yuming Qiao ... Guo-Jun Qi
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 38
24 Mar 2024

High-Fidelity Diffusion-Based Image Editing
Chen Hou ... Zhibo Chen
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 38
24 Mar 2024

Anycost GANs for Interactive Image Synthesis and Editing
Ji Lin ... Richard Zhang
01 Jun 2021
