Abstract
Text detection/localization, as an important task in computer vision, has witnessed substantial advancements in methodology and performance with convolutional neural networks. However, the vast majority of popular methods use rectangles or quadrangles to describe text regions. These representations have inherent drawbacks, especially relating to dense adjacent text and loose regional text boundaries, which usually cause difficulty detecting arbitrarily shaped text. In this paper, we propose a novel text region representation method, with a robust pipeline, which can precisely detect dense adjacent text instances with arbitrary shapes. We consider a text instance to be composed of an adaptive central text region mask and a corresponding expanding ratio between the central text region and the full text region. More specifically, our pipeline generates adaptive central text regions and corresponding expanding ratios with a proposed training strategy, followed by a newly proposed post-processing algorithm which expands central text regions to the complete text instance with the corresponding expanding ratios. We demonstrate that our new text region representation is effective, and that the pipeline can precisely detect closely adjacent text instances of arbitrary shapes. Experimental results on common datasets demonstrate superior performance o
Highlights
As a fundamental task in computer vision, accurate text detection is applicable to many fields in the real world including automatic identity recognition, financial document analysis and recognition, and environmental understanding
Text instances in the natural image usually consist of arbitrary shapes
We propose training our network on the same image with text instances labeled by different expanding ratios and their corresponding central text regions
Summary
As a fundamental task in computer vision, accurate text detection is applicable to many fields in the real world, including automatic identity recognition, financial document analysis and recognition, and environmental understanding. In the era of deep learning, the community has witnessed substantial advancements in the methodology and performance of text detection. This task still faces many challenges because of various image attributes, such as complex backgrounds, lighting conditions, and arbitrary shapes. Like previous state-of-the-art approaches, our proposed method focuses on arbitrary-shaped text detection. The central text region mask has a similar shape to the original text instance, which allows the proposed method to represent texts of arbitrary shapes and locate each text instance precisely. We propose a novel text region representation method that can precisely describe dense adjacent text instances with arbitrary shapes.
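The idea of describing a text instance by a central region plus an expanding ratio can be illustrated with a small geometric sketch. This is not the paper's algorithm: the paper works with predicted masks and a dedicated post-processing step, whereas the code below assumes polygon outlines and simply scales vertices about their mean point. The function names (`scale_about_vertex_mean`, `encode_instance`, `decode_instance`) and the `shrink` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def scale_about_vertex_mean(polygon, factor):
    """Scale polygon vertices about their mean point by `factor`.

    Assumption: uniform scaling stands in for the shrink/expand step;
    the paper's actual procedure operates on masks and may differ.
    """
    pts = np.asarray(polygon, dtype=float)
    center = pts.mean(axis=0)
    return center + factor * (pts - center)


def encode_instance(polygon, shrink=0.5):
    """Represent a text instance as (central region, expanding ratio).

    `shrink` is a hypothetical parameter in (0, 1]; the expanding ratio
    is defined here as its reciprocal.
    """
    central = scale_about_vertex_mean(polygon, shrink)
    expanding_ratio = 1.0 / shrink
    return central, expanding_ratio


def decode_instance(central, expanding_ratio):
    """Expand a central region back to the full text region."""
    return scale_about_vertex_mean(central, expanding_ratio)


# Two closely adjacent text boxes: their full regions nearly touch,
# but their central regions are clearly separated.
left = [(0, 0), (4, 0), (4, 2), (0, 2)]
right = [(4.1, 0), (8.1, 0), (8.1, 2), (4.1, 2)]

left_central, left_ratio = encode_instance(left)
right_central, right_ratio = encode_instance(right)

gap_full = min(x for x, _ in right) - max(x for x, _ in left)
gap_central = right_central[:, 0].min() - left_central[:, 0].max()
print(f"gap between full regions:    {gap_full:.2f}")
print(f"gap between central regions: {gap_central:.2f}")
```

The sketch shows why the representation helps with dense adjacent text: the shrunken central regions are easier to tell apart than the full regions, and each full region can still be recovered from its central region and ratio.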