Abstract

Scene text segmentation, which aims to generate pixel-level text masks, is an integral part of many fine-grained text tasks, such as text editing and text removal. Scene text is multi-scale and irregular in shape, is often embedded in complex background noise, and has diverse textures that sometimes resemble those of the background. These problems make general segmentation methods ineffective for scene text. To tackle these issues, we propose a new scene text segmentation pipeline called Attention and Recognition enhanced Multi-scale segmentation Network (ARM-Net), which consists of three main components: the Text Segmentation Module (TSM) generates rectangular receptive fields of various sizes to fit scene text and integrate global information adequately; the Dual Perceptual Decoder (DPD) strengthens the connection between pixels that belong to the same category from the spatial and channel perspectives simultaneously during upsampling; and the Recognition Enhanced Module (REM) provides text attention maps as a prior for the segmentation network, which can inherently distinguish text from background noise. Through extensive experiments, we demonstrate the effectiveness of each module of ARM-Net and show that its performance surpasses that of existing state-of-the-art scene text segmentation methods. We also show that the pixel-level masks produced by our method can further improve the performance of text removal and scene text recognition.
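
The abstract does not specify the internals of DPD or REM, so the following is only an illustrative sketch of the general idea of reweighting a feature map along the channel and spatial dimensions and then boosting it with a text attention prior. All function names, the sigmoid gating, and the `1 + prior` weighting are assumptions made for illustration, not the ARM-Net implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(features):
    """One weight per channel: sigmoid of the channel's global average.

    `features` is a list of C channels, each an H x W list of floats.
    (Hypothetical gating choice, not taken from the paper.)
    """
    gates = []
    for channel in features:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        gates.append(sigmoid(total / count))
    return gates


def spatial_gate(features):
    """One weight per pixel: sigmoid of the mean across channels."""
    c = len(features)
    h, w = len(features[0]), len(features[0][0])
    return [
        [sigmoid(sum(features[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def apply_dual_gate(features, prior=None):
    """Scale every value by its channel gate and its spatial gate.

    `prior` is an optional H x W attention map in [0, 1]; pixels with a
    high prior are amplified by the assumed factor (1 + prior).
    """
    cg = channel_gate(features)
    sg = spatial_gate(features)
    out = []
    for k, channel in enumerate(features):
        out_channel = []
        for i, row in enumerate(channel):
            out_row = []
            for j, value in enumerate(row):
                boost = 1.0 if prior is None else 1.0 + prior[i][j]
                out_row.append(value * cg[k] * sg[i][j] * boost)
            out_channel.append(out_row)
        out.append(out_channel)
    return out


# Toy input: 2 channels of size 2 x 2, with a prior marking the top-left pixel as text.
features = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
prior = [[1.0, 0.0], [0.0, 0.0]]
gated = apply_dual_gate(features, prior)
```

In this toy example the pixel flagged by the prior ends up with twice the response of its neighbours in the second channel, which is the kind of text-versus-background separation a prior is meant to encourage.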
