Abstract

Multi-oriented text detection in the wild is challenging because of variations in scale, orientation, illumination, and language. The traditional anchor mechanism in generic object detection generates only horizontal proposals, which cannot be applied to multi-oriented text regions. We therefore propose a novel convolutional regression network (CRN) to localize multi-oriented text in natural images. It consists of two components: a region proposal extractor and a text locator. Specifically, we first present a hierarchical deconvolution module (HDM) and a text-line and geometry segmentation module (TGM), both fully convolutional networks, to segment multi-oriented proposals accurately. A classification and regression module (CRM) then processes the proposals to obtain the final localization results. The whole framework can be trained end to end, which makes it suitable for detecting multi-oriented text. Extensive experiments on three mainstream scene-text datasets show that the proposed CRN achieves competitive performance.
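To illustrate why horizontal proposals fit multi-oriented text poorly, the following minimal sketch compares the area of a rotated text-line rectangle with the area of the smallest horizontal box that encloses it. It is a plain geometric illustration, not part of the CRN; the function names (`rotated_rect_corners`, `horizontal_box`, `text_fraction`) and the example dimensions are assumptions made for this example only.

```python
import math


def rotated_rect_corners(cx, cy, w, h, angle_deg):
    """Corners of a w x h rectangle centred at (cx, cy), rotated by angle_deg."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2-D rotation of the offset (dx, dy) about the centre.
        corners.append((cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a))
    return corners


def horizontal_box(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) enclosing the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def text_fraction(w, h, angle_deg):
    """Fraction of the enclosing horizontal box that the rotated rectangle covers."""
    x0, y0, x1, y1 = horizontal_box(rotated_rect_corners(0.0, 0.0, w, h, angle_deg))
    return (w * h) / ((x1 - x0) * (y1 - y0))


if __name__ == "__main__":
    # Hypothetical text line of 200 x 20 pixels at several orientations.
    for angle in (0, 15, 30, 45):
        print(angle, round(text_fraction(200, 20, angle), 3))
```

Under these assumed dimensions, a horizontal text line fills its box completely, while the same line rotated by 45° fills only about 17% of the enclosing horizontal box; the rest is background. This is the kind of mismatch that motivates oriented proposals.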

Highlights

  • Reading text from natural images has attracted much attention in computer vision because of its numerous applications, such as image retrieval [1]–[4], robot navigation [5], [6], video analysis [7]–[9] and scene understanding [10]–[14]

  • We propose a single framework, the Convolutional Regression Network (CRN), that combines segmentation and detection in an end-to-end manner

  • We develop two modules to handle multi-oriented text: the Hierarchical Deconvolution Module (HDM) and the Text-line and Geometry segmentation Module (TGM)


Summary

Introduction

Reading text from natural images has attracted much attention in computer vision because of its numerous applications, such as image retrieval [1]–[4], robot navigation [5], [6], video analysis [7]–[9] and scene understanding [10]–[14]. Accurate text localization/detection [15]–[18] is a prerequisite for understanding text effectively. We focus on the complicated task of multi-oriented text detection. Previous text detection methods usually involve many sequential steps, including character detection, character classification, text-line construction and word splitting. These multi-step approaches are complicated, and errors may accumulate as the number of steps grows. Many methods [19], [20] based on generic object detection

