Abstract

Text, regarded as one of the important clues for visual recognition, can provide rich and accurate high-level semantic information. The detection and recognition of text have therefore become a research hotspot in computer vision and artificial intelligence. However, the difficulty of data collection and the non-uniform distribution of characters still pose challenges for accurate text recognition, especially for complicated character sets such as Chinese. To address small-sample text recognition, we propose an improved image-based text transfer framework, named $\mathrm{T}^2$Net. The framework can replace or modify the text content in an image so as to arbitrarily expand a recognition data set. Considering that the main challenge of text transfer lies in decoupling the complex interrelationship between text and background, a text content mask branch is first added to the background inpainting module so as to restore background textures more realistically. Second, a text recognition model is developed to guide the readability of the text transfer results in the text conversion module. Finally, a text fusion module fuses the independently transferred background and text. We evaluated the proposed framework on a real-world scene text recognition data set. Qualitative and quantitative results demonstrate the efficiency of our method in comparison with previous works.
