Holistic Vertical Regional Proposal Network for scene text detection

Xu Chen, Jun Zhang, Qiang Guo, Shuohao Li

doi:10.1109/icivc.2017.7984521

Abstract

Scene text detection is an important research problem in the computer vision community, with great application value in many fields. Inspired by Faster R-CNN, a popular method for object detection, we apply the Regional Proposal Network (RPN) to scene text detection, since text can be regarded as a common object. The core of RPN is to detect objects of different sizes with anchors of different sizes. However, when RPN is applied directly, it is difficult to design anchors at enough different scales to cover the range of text box sizes. We therefore adjust the anchor settings and use vertical anchors to break the restrictions of the receptive field. In addition, we draw on the multi-scale network Holistically-Nested Edge Detection (HED), which produces side-output results at different stages of the neural network. The bottom layers have a smaller receptive field and represent the features of small text areas in the image. The high-level side-outputs have a larger receptive field and handle large text areas better. We combine the advantages of the RPN and HED methods and propose a Holistic Vertical Regional Proposal Network (HVRPN) for scene text detection. Our model shows good results on ICDAR03 and ICDAR11.
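
The vertical-anchor idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `vertical_anchors`, the stride of 16 pixels, and the list of anchor heights are all assumptions chosen for illustration, since the abstract does not give the actual settings. The sketch only shows the general principle that each feature-map cell gets anchors of one fixed width and several heights, so a long text line can be covered by a sequence of narrow boxes rather than one wide anchor.

```python
def vertical_anchors(feat_w, feat_h, stride=16, heights=(11, 16, 23, 33, 48)):
    """Generate fixed-width, variable-height anchors as (x1, y1, x2, y2).

    stride and heights are illustrative assumptions, not values from the paper.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            x1 = col * stride                   # left edge of this cell
            cy = row * stride + stride / 2.0    # vertical center of this cell
            for h in heights:
                # Width is always `stride`; only the height varies.
                anchors.append((x1, cy - h / 2.0, x1 + stride, cy + h / 2.0))
    return anchors


anchors = vertical_anchors(feat_w=4, feat_h=3)
print(len(anchors))  # 4 * 3 cells, 5 heights each
```

Because the width is fixed, only the vertical extent of each anchor has to match the text, which reduces the number of anchor scales that must be designed by hand.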

Full Text