Abstract

Deep learning is a subfield of artificial intelligence that allows the computer to adopt and learn some new rules. Deep learning algorithms can identify images, objects, observations, texts, and other structures. In recent years, scene text recognition has inspired many researchers from the computer vision community, and still, it needs improvement because of the poor performance of existing scene recognition algorithms. This research paper proposed a novel approach for scene text recognition that integrates bidirectional LSTM and deep convolution neural networks. In the proposed method, first, the contour of the image is identified and then it is fed into the CNN. CNN is used to generate the ordered sequence of the features from the contoured image. The sequence of features is now coded using the Bi-LSTM. Bi-LSTM is a handy tool for extracting the features from the sequence of words. Hence, this paper combines the two powerful mechanisms for extracting the features from the image, and contour-based input image makes the recognition process faster, which makes this technique better compared to existing methods. The results of the proposed methodology are evaluated on MSRATD 50 dataset, SVHN dataset, vehicle number plate dataset, SVT dataset, and random datasets, and the accuracy is 95.22%, 92.25%, 96.69%, 94.58%, and 98.12%, respectively. According to quantitative and qualitative analysis, this approach is more promising in terms of accuracy and precision rate.

Highlights

  • Understanding the visual scene is an active research area for the computer vision community

  • We proposed a scene text recognition method, and the proposed system is divided into three steps: in the first step, adaptive binarization technique is applied to the image so that the noise can be removed from the image, and it helps to extract the features from the blurred and complex background

  • CNN is combined with Bi-LSTM to make the classifier more powerful, and it is a handy tool for extracting the features from the sequence of words. is paper combines the two powerful mechanisms for extracting the features from the image and contour-based input image making the recognition process faster, which makes this technique better compared to existing methods. e complete detail of the proposed method is discussed in Section 3. e rest of the paper is structured as follows: Section 2 describes the related work

Read more

Summary

Introduction

Understanding the visual scene is an active research area for the computer vision community. It needs enormous research in the field of computer vision and its subfields. Visual scene understanding includes the processing of both image and text, and it is always a difficult task to understand the scene and read the text written in the image. Ese are the various challenges of text detection and text recognition. Text recognition allows the computer to understand and predict the text in the given input scene image and convert it into the computer’s understandable format. Text recognition is the most popular method for converting old printed documents into digitized forms.

Objectives
Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call