Abstract
Text recognition in the wild is a challenging task in computer vision and machine learning. Existing optical character recognition engines perform poorly on natural scene images. In this context, deep learning models have emerged as a powerful state-of-the-art technique for classification and recognition. This study proposes a new Convolutional Neural Network based system for scene text reading. We investigate how to combine a character recognition module with a subsequent word recognition module to achieve the overall system goal. The first module analyzes characters within multi-scale images, relying on the power of a convolutional network and a fully connected network for character recognition. The second module relies on a Viterbi search to find the closest word to a given character sequence. For greater precision, a bigram-based linguistic module is applied. The proposed system achieves state-of-the-art performance on three standard scene text recognition benchmarks: Chars74K, ICDAR 2003 and ICDAR 2013. In particular, this performance is demonstrated in terms of both character and word recognition accuracy, as well as speed, via a comparative study of different deep learning architectures.
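To make the word recognition step more concrete, the following is a minimal sketch of how a Viterbi search might combine per-character classifier scores with bigram probabilities. It is not the system described above: the function name `viterbi_decode`, the `floor` smoothing value, the toy alphabet and all probabilities are assumptions chosen for illustration, and the sketch decodes without a lexicon, whereas the abstract describes finding the closest word.

```python
import math

def viterbi_decode(char_probs, bigram, alphabet, floor=1e-6):
    """Return the most likely character sequence (illustrative sketch).

    char_probs: list of dicts, one per character position, mapping a
                character to its (hypothetical) classifier probability.
    bigram:     dict mapping (previous_char, char) to a transition probability.
    floor:      assumed small probability for unseen characters or bigrams.
    """
    # best[c] = (log-score of the best path ending in c, that path)
    best = {c: (math.log(char_probs[0].get(c, floor)), c) for c in alphabet}
    for probs in char_probs[1:]:
        new_best = {}
        for c in alphabet:
            emit = math.log(probs.get(c, floor))
            # Pick the predecessor that maximizes path score + transition + emission.
            score, path = max(
                (best[p][0] + math.log(bigram.get((p, c), floor)) + emit, best[p][1])
                for p in alphabet
            )
            new_best[c] = (score, path + c)
        best = new_best
    # The highest-scoring final state carries the decoded sequence.
    return max(best.values())[1]

# Toy example: the classifier slightly prefers "o" in the middle position,
# but the bigram statistics favor "ca" and "at" over "co" and "ot".
alphabet = "acot"
char_probs = [{"c": 0.9}, {"a": 0.45, "o": 0.55}, {"t": 0.9}]
bigram = {("c", "a"): 0.5, ("a", "t"): 0.5, ("c", "o"): 0.1, ("o", "t"): 0.1}
decoded = viterbi_decode(char_probs, bigram, alphabet)
print(decoded)  # cat
```

In this toy case the bigram scores override the classifier's slight preference for "o", which is the kind of correction a linguistic module is meant to provide.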