Abstract

Text style transfer is a challenging problem in optical character recognition. Recent advances focus mainly on using the desired text style to guide the model in synthesizing text images, while the scene background is typically ignored. In natural scenes, however, the scene and the text form a coherent whole. Scene text image translation therefore poses two key challenges: i) transferring the text and the scene into different styles, and ii) keeping the scene and the text consistent with each other. To address these problems, we propose a novel end-to-end scene text style transfer framework that simultaneously translates the text instances and the scene background into different styles. We introduce an attention style encoder to extract separate style codes for the text instances and the scene, and we perform style transfer training on the cropped text regions and the scene separately so that the generated images remain harmonious. We evaluate our method on the ICDAR2015 and MSRA-TD500 scene text datasets. The experimental results demonstrate that the synthetic images generated by our model benefit the scene text detection task.
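
The paper does not include code, but the attention style encoder it describes can be illustrated with a minimal sketch: a small CNN backbone whose features are pooled by two separate attention heads, one yielding a style code for the text instances and one for the scene background. The class name `AttentionStyleEncoder`, the layer widths, and the `style_dim` parameter below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionStyleEncoder(nn.Module):
    """Hypothetical sketch of an attention style encoder: extracts
    separate style codes for the text instances and the scene
    background via attention pooling over shared CNN features."""

    def __init__(self, style_dim=64):
        super().__init__()
        # Small convolutional backbone (layer sizes are assumptions).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        # One attention head per region (text / scene), each producing
        # a spatial weighting over the shared feature map.
        self.text_attn = nn.Conv2d(128, 1, kernel_size=1)
        self.scene_attn = nn.Conv2d(128, 1, kernel_size=1)
        self.to_style = nn.Linear(128, style_dim)

    def _attn_pool(self, feats, logits):
        # Softmax over spatial positions, then attention-weighted sum.
        b, c, h, w = feats.shape
        weights = F.softmax(logits.view(b, 1, h * w), dim=-1)
        pooled = (feats.view(b, c, h * w) * weights).sum(dim=-1)
        return self.to_style(pooled)

    def forward(self, image):
        feats = self.backbone(image)
        text_code = self._attn_pool(feats, self.text_attn(feats))
        scene_code = self._attn_pool(feats, self.scene_attn(feats))
        return text_code, scene_code

# Usage: one image in, two style codes out.
encoder = AttentionStyleEncoder()
text_code, scene_code = encoder(torch.randn(1, 3, 128, 128))
print(text_code.shape, scene_code.shape)  # torch.Size([1, 64]) each
```

In a full system the two codes would condition separate generators (or separate branches of one generator), consistent with the paper's strategy of training style transfer on the cropped text regions and the scene independently.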
