Abstract

Scene text detection methods based on Convolutional Neural Networks (CNNs) mostly rely on semantic segmentation (text/non-text classification) to localize text regions. However, unlike instance segmentation, semantic segmentation cannot distinguish individual text lines. In this paper, we propose a novel framework based on a Fully Convolutional Network (FCN) and a Recurrent Neural Network (RNN) that performs both scene text detection and instance segmentation. The FCN classifies text and non-text regions, and the RNN uses the features extracted by the FCN to simultaneously detect and segment one text instance at each time step. The framework also extracts bounding boxes in a much simpler way than non-maximum suppression (NMS). The proposed method achieves competitive results on two public benchmarks, the ICDAR 2015 Incidental Scene Text Dataset and the ICDAR 2013 Focused Scene Text Dataset. Moreover, we demonstrate the benefits of adding a regression task to the RNN module.
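
For context, the NMS post-processing step that the abstract uses as its point of comparison can be sketched as below. This is a generic greedy NMS over axis-aligned boxes, not code from the paper; the function names `iou` and `greedy_nms`, the `[x1, y1, x2, y2]` box layout, and the 0.5 threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes [x1, y1, x2, y2]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy NMS (hypothetical helper).

    Boxes are visited in order of descending score; a box is kept only if
    it does not overlap any already-kept box by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Illustrative input: two heavily overlapping candidates and one separate box.
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

NMS of this kind has to compare many overlapping candidate boxes pairwise and depends on a hand-tuned overlap threshold. According to the abstract, the proposed framework instead has the RNN emit one text instance per time step, which is why its box extraction is described as simpler.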
