Abstract

Scene text editing aims to replace the source text in an image with target text while preserving the original text style and background, a task that involves text detection, style transfer, and image inpainting. Because text styles in real scenes are highly diverse, with features such as colorful outlines and shadows, editing complicated text is quite challenging. To address this problem, a mask-guided GAN method is proposed that makes full use of the body, outline, and shadow of the text to guide the task. First, a mask generating module is designed to detect the body, outline, and shadow regions in the text image. The mask generated for the source image is then used by the proposed text inpainting module and background inpainting module to extract the stylish source text and restore the background, respectively. Next, a shape transfer module infers from the source mask a target mask that depicts the structural style of the text, which guides a style transfer module to transfer the text style onto the target image. Finally, a fusion module fuses the target text with the background. Each module fulfills its own functional role while collaborating with the others, decomposing the complex task into easy-to-learn subtasks. Experiments and comparisons demonstrate the effectiveness of the proposed method.
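The abstract describes a fixed data flow between six modules. The dependency-free sketch below traces that flow only; every function is a plain-Python stand-in (an assumption, not the paper's code) for the corresponding learned sub-network, and images and masks are represented as strings purely to make the wiring visible.

```python
# Hypothetical data-flow sketch of the mask-guided pipeline; each function
# stands in for a learned sub-network and simply records what it consumed.

def mask_generate(src_img):
    """Detect body, outline, and shadow regions of the source text."""
    return f"mask({src_img})"

def text_inpaint(src_img, src_mask):
    """Extract the stylish source text, guided by the generated mask."""
    return f"styled_text({src_img},{src_mask})"

def background_inpaint(src_img, src_mask):
    """Erase the source text and restore the background behind it."""
    return f"background({src_img},{src_mask})"

def shape_transfer(src_mask, tgt_text):
    """Infer the target mask (structural style) from the source mask."""
    return f"tgt_mask({src_mask},{tgt_text})"

def style_transfer(styled_text, tgt_mask):
    """Render the target text in the source style, guided by the target mask."""
    return f"styled_tgt({styled_text},{tgt_mask})"

def fuse(styled_tgt, background):
    """Composite the styled target text onto the restored background."""
    return f"fused({styled_tgt},{background})"

def edit_scene_text(src_img, tgt_text):
    """End-to-end flow: mask -> (text, background) inpainting -> shape/style transfer -> fusion."""
    src_mask = mask_generate(src_img)
    styled = text_inpaint(src_img, src_mask)
    bg = background_inpaint(src_img, src_mask)
    tgt_mask = shape_transfer(src_mask, tgt_text)
    styled_tgt = style_transfer(styled, tgt_mask)
    return fuse(styled_tgt, bg)
```

Note how the source mask is consumed three times (text inpainting, background inpainting, and shape transfer), which is what lets each module solve a simpler, well-scoped subtask.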
