Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model

Lingjun Zhang, Yaohui Wang, Xinyuan Chen, Yu Qiao, Yue Lu

doi:10.1609/aaai.v38i7.28550

Abstract

Diffusion-based image generation methods have recently been credited for their remarkable text-to-image capabilities, yet they still struggle to generate accurate multilingual scene text images. To tackle this problem, we propose Diff-Text, a training-free scene text generation framework for any language. Given a text in any language along with a textual description of a scene, our model outputs a photo-realistic image. The model leverages rendered sketch images as priors, thereby eliciting the latent multilingual generation ability of the pre-trained Stable Diffusion. Based on the observation that the cross-attention map influences object placement in generated images, we introduce a localized attention constraint into the cross-attention layer to address the unreasonable positioning of scene text. Additionally, we introduce contrastive image-level prompts to further refine the position of the textual region and achieve more accurate scene text generation. Experiments demonstrate that our method outperforms the existing method in both text recognition accuracy and the naturalness of foreground-background blending.
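
The abstract does not spell out the exact form of the localized attention constraint, so the following is only a minimal sketch of one plausible reading: prompt tokens that refer to the scene text are prevented from attending outside a target region. It is not the authors' implementation. The function name `localized_cross_attention`, the arguments `region_mask` and `text_token_ids`, and the hard masking rule are all assumptions made for illustration, and plain NumPy stands in for a real Stable Diffusion cross-attention layer.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; rows may contain -inf entries.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def localized_cross_attention(q, k, region_mask, text_token_ids):
    """Hypothetical localized constraint on a cross-attention map.

    q:              (P, d) queries, one per spatial position
    k:              (T, d) keys, one per prompt token
    region_mask:    (P,) bool, True where the scene text should appear
    text_token_ids: indices of prompt tokens that describe the text
                    (must not cover every token, or rows become undefined)
    Returns a (P, T) attention map whose rows sum to 1.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)

    # Assumption: outside the target region, text-related tokens are
    # blocked entirely by setting their scores to -inf before softmax.
    block = np.zeros(scores.shape, dtype=bool)
    block[np.ix_(~region_mask, text_token_ids)] = True
    scores = np.where(block, -np.inf, scores)

    return softmax(scores, axis=-1)


# Toy example: 6 spatial positions, 5 prompt tokens, 4-dim features.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(5, 4))
region_mask = np.array([True, True, False, False, False, True])
attn = localized_cross_attention(q, k, region_mask, [2, 3])
```

In this toy setup, positions outside `region_mask` assign zero attention to tokens 2 and 3, while positions inside the region remain free to attend to them. A softer penalty instead of a hard mask would be an equally valid reading of the abstract.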

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Mar 24, 2024
Citations: 1

Similar Papers

Rethinking Multilingual Scene Text Spotting: A Novel Benchmark and a Character-Level Feature Based Approach
Siliang Ma ... Yong Xu
American Journal of Computer Science and Technology | VOL. 7
06 Sep 2024

Enhanced scene text recognition using deep learning based hybrid attention recognition network
Ratnamala S Patil ... Rakesh Huded
IAES International Journal of Artificial Intelligence (IJ-AI) | VOL. 13
01 Dec 2024

Multi-lingual scene text detection and language identification
Shaswata Saha ... Ram Sarkar
Pattern Recognition Letters | VOL. 138
27 Jun 2020

An Optimized Ant Colony Algorithm for Text Edge Extraction
Qubo Xie ... Ke Zhou
01 Oct 2019
