Abstract

Scene text detection remains challenging because text may contain extremely small or low-resolution strokes, and instances may be closely spaced or arbitrarily shaped. In this paper, we propose StrokeNet, which detects text by capturing fine-grained strokes and inferring structural relations between the hierarchical representations of each text area in a graph-based network. Unlike existing approaches that represent a text area by a series of points or rectangular boxes, we directly localize the strokes of each text instance. We introduce the Stroke Assisted Prediction Network (SAPN), which performs hierarchical representation learning of text areas and effectively captures extremely small or low-resolution text. We then extract a series of text- and stroke-level rectangular boxes on the predicted text areas; these are treated as graph nodes and grouped to form the corresponding local graphs. The Hierarchical Relation Graph Network (HRGN) then performs relational reasoning and predicts the likelihood of linkages among graph nodes of different levels. It efficiently splits close text instances and groups the node classification results into arbitrary-shaped text areas. We also introduce a novel dataset with stroke-level annotations, SynthStroke, for offline pre-training of widely used text detectors. Experiments on benchmarks verify the state-of-the-art performance of our method.
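The final grouping step, in which predicted linkage likelihoods are turned into separate text instances, can be illustrated with a minimal sketch. This is not the HRGN itself: it assumes the linkage likelihoods have already been predicted, and it groups nodes by thresholded connected components using union-find. The function name `group_nodes_by_linkage`, the `link_scores` dictionary format, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def group_nodes_by_linkage(num_nodes, link_scores, threshold=0.5):
    """Group graph nodes into text instances (illustrative sketch only).

    num_nodes:   number of box nodes (text- or stroke-level).
    link_scores: dict mapping (i, j) node-index pairs to an assumed
                 linkage likelihood in [0, 1].
    threshold:   hypothetical cut-off; links scoring below it are ignored.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root index under the smaller one.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Keep only confident links; weak links separate close instances.
    for (i, j), score in link_scores.items():
        if score >= threshold:
            union(i, j)

    # Collect the nodes of each connected component.
    groups = defaultdict(list)
    for node in range(num_nodes):
        groups[find(node)].append(node)
    return sorted(groups.values())


# Five nodes in a row; the weak link between nodes 2 and 3 splits them.
example_scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.2, (3, 4): 0.95}
instances = group_nodes_by_linkage(5, example_scores)
```

In this toy example the low score between nodes 2 and 3 keeps two adjacent instances apart, which mirrors the idea of splitting close text instances while still merging the nodes that belong to the same text area.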
