Abstract

Optical character recognition (OCR) is the electronic conversion of digital images of handwritten or printed text (from a scanned file, a book, a photo of a document, or a scene photo) into editable and searchable data for further processing. The aim of this work is to enable a device to automatically recognize text in images for use in other applications, such as self-driving cars. We build a convolutional neural network (CNN) model that can effectively recognize English handwritten characters and digits. The CNN is a common form of deep neural network capable of extracting features and classifying characters with high accuracy and low dimensionality. This paper proposes a CNN model using TensorFlow, an open-source library for machine learning. TensorFlow can be used across a range of tasks, but it focuses particularly on the training and inference of deep neural networks. Our proposed CNN model was trained and tested on the Extended Modified National Institute of Standards and Technology (EMNIST) dataset, which contains 131,600 characters across 47 classes of letters and digits. The use of a CNN leads to significant improvements over other machine learning classification algorithms. Compared with commonly proposed methods, our CNN achieves an average accuracy of 89.46% within an acceptable computational time on the local CPU of a personal computer.

Keywords: Optical character recognition; Deep learning; Convolutional neural network; English handwritten detection
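
As a rough illustration of the kind of model the abstract describes, the sketch below builds a small CNN in TensorFlow for 28x28 grayscale character images with 47 output classes. The abstract does not give the architecture, so the layer sizes, the optimizer, and the function name `build_emnist_cnn` are assumptions made for illustration only and should not be read as the authors' model.

```python
import tensorflow as tf


def build_emnist_cnn(num_classes: int = 47) -> tf.keras.Model:
    """Build a small, hypothetical CNN for 28x28 grayscale character images.

    The layer choices here are illustrative assumptions, not the
    architecture used in the paper.
    """
    model = tf.keras.Sequential(
        [
            # EMNIST images are 28x28 pixels with a single channel.
            tf.keras.layers.Input(shape=(28, 28, 1)),
            # Convolution + pooling blocks extract local stroke features
            # while shrinking the spatial dimensions.
            tf.keras.layers.Conv2D(32, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Conv2D(64, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            # Flatten the feature maps and classify with dense layers.
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation="relu"),
            # One softmax output per class (47 letter and digit classes).
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ]
    )
    # Integer class labels are assumed, hence the sparse loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_emnist_cnn().summary()` prints the layer shapes; training would then be a call to `model.fit` on image and label arrays loaded from whichever EMNIST source is available.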
