Anchor-free multi-orientation text detection in natural scene images

Liqiong Lu, Tao Wu, Faliang Huang, Yaohua Yi, Dong Wu

doi:10.1007/s10489-020-01742-z

Abstract

Text detection in natural scene images is a key prerequisite for computer vision tasks such as image search, navigation for the blind, autonomous driving, and multi-language translation. Existing text detection methods often detect only part of a large-scale text region and have difficulty detecting small-scale texts. To address this problem, an anchor-free multi-orientation text detection method is proposed. First, a Feature Pyramid Network (FPN) combines multiple feature layers of a Convolutional Neural Network (CNN) to predict the geometric properties of text; this expands the receptive field of each pixel and thus helps to detect more large-scale texts. Second, a new loss function that is independent of text scale is designed, which gives the pixels in small-scale texts a larger weight in the loss calculation and thereby facilitates the detection of small-scale texts. Finally, the results of pixel-level semantic segmentation are used to filter out obviously unreasonable candidate text boxes, improving both the accuracy and the recall rate of text detection. Experimental results on ICDAR 2015 and MSRA-TD500 demonstrate the good performance of our method.
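
The abstract does not give the form of the scale-independent loss, so the following is only an illustrative sketch of one common way to obtain that property: weighting each text pixel by the inverse of its instance's area, so that every text instance contributes equally regardless of size. The function names, the instance-map encoding (0 for background, k > 0 for text instance k), and the normalization are all assumptions made for illustration, not the authors' formulation.

```python
from collections import Counter


def scale_balanced_weights(instance_map):
    """Per-pixel weights such that each text instance sums to 1.

    instance_map: 2D list of ints (hypothetical encoding);
    0 marks background, k > 0 marks pixels of text instance k.
    """
    # Area (pixel count) of every text instance.
    areas = Counter(v for row in instance_map for v in row if v > 0)
    # A pixel in a small instance gets a larger weight than one in a large instance.
    return [[1.0 / areas[v] if v > 0 else 0.0 for v in row]
            for row in instance_map]


def weighted_loss(per_pixel_loss, weights):
    """Weighted mean of a per-pixel loss map (assumed normalization)."""
    total_weight = sum(w for row in weights for w in row)
    if total_weight == 0:
        return 0.0  # no text pixels in this image
    weighted_sum = sum(
        loss * w
        for loss_row, w_row in zip(per_pixel_loss, weights)
        for loss, w in zip(loss_row, w_row)
    )
    return weighted_sum / total_weight


# One 4-pixel text instance and one 1-pixel text instance.
instance_map = [[1, 1, 1, 1],
                [0, 2, 0, 0]]
weights = scale_balanced_weights(instance_map)
# Loss is zero on the large instance and 2.0 on the small one.
per_pixel_loss = [[0.0, 0.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0, 0.0]]
loss = weighted_loss(per_pixel_loss, weights)
```

Under this weighting the single-pixel instance carries as much total weight as the four-pixel one, whereas an unweighted mean over text pixels would let the larger instance dominate.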

Full Text