Abstract

In this study, a natural scene text detection method based on an improved Faster region-based convolutional neural network (Faster R-CNN) is proposed. The method extracts image features with the Inception-ResNet architecture, uses a region proposal network to generate region proposals from the extracted features, merges the fine-tuned features with the region proposals, and finally uses Fast R-CNN to classify and locate text. The proposed method addresses the problems of varying text size and of text being partly obscured in the image. Compared with the original Faster R-CNN, the multilevel Inception-ResNet model presented in this study can extract deeper text features. The extracted feature map is further sparsely represented by the Reduction-B, Inception-ResNet-C, and Avg Pool stages, and is then fused with the text regions obtained from the lower-layer text feature mapping network to obtain the exact text regions. The method is tested on the dataset of the ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17), which contains a large number of distorted and blurry texts of different scales and sizes. An accuracy of 76.4% is achieved on this dataset, demonstrating the effectiveness of the proposed method.
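The data flow described above (deep features fused with lower-layer features, then scored per region proposal) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fusion rule (nearest-neighbour upsampling followed by element-wise addition), the mean-pooling score, and the threshold are all assumptions chosen only to make the idea concrete on tiny hand-written feature maps.

```python
# Illustrative sketch only. Every name here is hypothetical, and the fusion
# rule (upsample + element-wise add) is an assumption, not the paper's method.

def upsample_nearest(fmap, factor):
    """Repeat each cell `factor` times along both axes (nearest-neighbour)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def fuse(shallow, deep, factor):
    """Add the upsampled deep map to the shallow map, cell by cell."""
    up = upsample_nearest(deep, factor)
    return [[s + d for s, d in zip(srow, drow)]
            for srow, drow in zip(shallow, up)]

def roi_mean(fmap, box):
    """Average the values inside box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(cells) / len(cells)

def detect(shallow, deep, proposals, factor=2, threshold=0.5):
    """Keep proposals whose pooled fused response exceeds `threshold`."""
    fused = fuse(shallow, deep, factor)
    return [box for box in proposals if roi_mean(fused, box) > threshold]

# Toy inputs: a 4x4 lower-layer map and a 2x2 deeper map, both responding
# in the bottom-right corner, plus two candidate region proposals.
shallow = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]
deep = [
    [0.0, 0.0],
    [0.0, 0.5],
]
proposals = [(0, 0, 2, 2), (2, 2, 4, 4)]
text_boxes = detect(shallow, deep, proposals)
```

In this toy case only the bottom-right proposal survives, because both feature levels respond there. In a real detector the hand-written maps would be replaced by learned Inception-ResNet features, the proposals by region proposal network outputs, and the threshold test by the Fast R-CNN classification and box regression heads.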


