Abstract

Recognition of texts in scenes is one of the most important tasks in many computer vision applications. Though different scene text recognition techniques have been developed, scene text recognition under a generic condition is still a very open and challenging research problem. One major factor that defers the advance in this research area is character touching, where many characters in scene images are heavily touched with each other and cannot be segmented for recognition. In this paper, we proposed a novel scene text recognition technique that performs word level recognition without character segmentation. Our proposed technique has three advantages. First it converts each word image into a sequential signal for the scene text recognition. Second, it adapts the recurrent neural network (RNN) with Long Short Term Memory (LSTM), the technique that has been widely used for handwriting recognition in recent years. Third, by integrating multiple RNNs, an accurate recognition system is developed which is capable of recognizing scene texts including those heavily touched ones without character segmentation. Extensive experiments have been conducted over a number of datasets including several ICDAR Robust Reading datasets and Google Street View dataset. Experiments show that the proposed technique is capable of recognizing texts in scenes accurately.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.