Abstract

Text detection in natural scene images is a key prerequisite for computer vision tasks such as image search, blind navigation, autonomous driving, and multi-language translation. Existing text detection methods often detect only part of a large-scale text region and have difficulty detecting small-scale text. To address this problem, an anchor-free multi-orientation text detection method is proposed. First, a Feature Pyramid Network (FPN) combines multiple feature layers of a Convolutional Neural Network (CNN) to predict the geometric properties of text; this enlarges the receptive field of each pixel and thus helps to detect more large-scale text. Second, a new loss function that is independent of text scale is designed, which gives the pixels in small-scale text a larger weight in the loss calculation and thereby facilitates the detection of small-scale text. Finally, the results of pixel-level semantic segmentation are used to filter out obviously unreasonable candidate text boxes, improving both the accuracy and the recall of text detection. Experimental results on ICDAR 2015 and MSRA-TD500 demonstrate the good performance of our method.
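
The abstract does not give the form of the scale-independent loss, so the following is only a minimal sketch of one common way to realize the idea, not the authors' formulation. It assumes that each pixel is weighted by the inverse of the pixel area of the text instance it belongs to, so that every instance contributes equally to the loss regardless of its size. The function names `scale_balanced_weights` and `scale_balanced_loss`, and the instance-map input format, are hypothetical.

```python
import numpy as np


def scale_balanced_weights(instance_map: np.ndarray) -> np.ndarray:
    """Per-pixel weights so that each text instance sums to 1.

    instance_map: H x W integer array, 0 = background, k > 0 = text instance id.
    Assumption (not from the paper): weight = 1 / (pixel area of the instance).
    """
    weights = np.zeros(instance_map.shape, dtype=np.float64)
    ids = np.unique(instance_map)
    for k in ids[ids > 0]:
        mask = instance_map == k
        # Pixels of a small instance receive a larger weight than
        # pixels of a large instance.
        weights[mask] = 1.0 / mask.sum()
    return weights


def scale_balanced_loss(per_pixel_loss: np.ndarray, instance_map: np.ndarray) -> float:
    """Average the per-pixel loss over text instances instead of over pixels."""
    ids = np.unique(instance_map)
    num_instances = int(np.count_nonzero(ids > 0))
    if num_instances == 0:
        return 0.0  # no text in the image, nothing to weight
    weights = scale_balanced_weights(instance_map)
    return float((weights * per_pixel_loss).sum() / num_instances)
```

Under this assumption, a 2-pixel text instance and an 8-pixel text instance each contribute the same total weight, whereas a plain per-pixel average would let the larger instance dominate.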
