Optical Character Recognition Using Deep Learning Techniques for Printed and Handwritten Documents

Sanika Bagwe, Vruddhi Shah, Urvi Bheda, Vishant Mehta, Amanshu Tiwari, Vrushabh Gada, Ninad Mehendale, Jugal Chauhan, Vartika Gupta, Durva Raikar, Mahesh Warang, Purvi Harniya

doi:10.2139/ssrn.3664620

Abstract

Despite decades of research, developing optical character recognition (OCR) systems with capabilities comparable to those of a human remains an open challenge. Large volumes of documents held as images need to be entered into computer databases; images take up far more memory than editable text, and errors can occur when data is interpreted from an image. This project aims to use OCR to convert handwritten or printed documents into editable text. Documents are scanned into image format and passed to doc_class_net, a full-size image classifier that assigns the input image to one of four classes: printed, semi-printed, handwritten discrete, and handwritten cursive. The OCR model predicts and then decodes the text in the image and outputs it as editable text. We applied OCR to printed text images using Pytesseract. For handwritten text images, the text is predicted using a self-developed convolutional recurrent neural network (CRNN) named CL-9 (7 CNN layers and 2 LSTM layers). The accuracies of the doc_class_net classifier and the line_class_net classifier (a line-wise classifier) were 88.03% and 82.1%, respectively. The overall accuracies obtained for printed, handwritten discrete, and handwritten cursive text were 94.79%, 75.2%, and 65.7%, respectively. OCR has real-time applications in fields such as medical prescriptions, smart libraries, and tax returns. Using this method, books, magazines, and other kinds of documents can be digitized and made accessible efficiently.
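
The routing step described in the abstract (classify the scanned page, then send it to the matching recognizer) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the function names `make_ocr_pipeline`, `classify_document`, `recognize_printed`, `recognize_handwritten`, and `recognize_mixed` are hypothetical, the label strings are taken from the abstract's class names, and the abstract does not say how semi-printed pages are handled, so that route is left as a caller-supplied function. The only real library call is `pytesseract.image_to_string`, which is imported lazily so the sketch runs without Tesseract installed.

```python
from typing import Any, Callable, Dict, Tuple

# The four document classes named in the abstract.
PRINTED = "printed"
SEMI_PRINTED = "semi-printed"
HANDWRITTEN_DISCRETE = "handwritten discrete"
HANDWRITTEN_CURSIVE = "handwritten cursive"

Recognizer = Callable[[Any], str]


def tesseract_recognizer(image: Any) -> str:
    """Printed-text route using Pytesseract (needs the Tesseract binary)."""
    import pytesseract  # third-party; imported lazily on purpose

    return pytesseract.image_to_string(image)


def make_ocr_pipeline(
    classify_document: Callable[[Any], str],  # stand-in for doc_class_net
    recognize_printed: Recognizer,            # e.g. tesseract_recognizer
    recognize_handwritten: Recognizer,        # stand-in for the CL-9 CRNN
    recognize_mixed: Recognizer,              # assumption: semi-printed handler
) -> Callable[[Any], Tuple[str, str]]:
    """Return a function that classifies an image and runs the matching OCR."""
    routes: Dict[str, Recognizer] = {
        PRINTED: recognize_printed,
        SEMI_PRINTED: recognize_mixed,
        HANDWRITTEN_DISCRETE: recognize_handwritten,
        HANDWRITTEN_CURSIVE: recognize_handwritten,
    }

    def run(image: Any) -> Tuple[str, str]:
        label = classify_document(image)
        if label not in routes:
            raise ValueError(f"unknown document class: {label!r}")
        # Return the predicted class alongside the recognized text.
        return label, routes[label](image)

    return run
```

A caller would build the pipeline once, for example `run = make_ocr_pipeline(my_classifier, tesseract_recognizer, my_crnn, my_line_wise_ocr)`, and then call `run(image)` per page; all four arguments here are placeholders for whatever models are actually available.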

Journal: SSRN Electronic Journal
Publication Date: Sep 14, 2020
Citations: 5

Similar Papers

Adaptive image restoration of text images that contain touching or broken characters
P Stubberud ... J Kanai
14 Aug 1995

Soft Computing Techniques for Optical Character Recognition Systems
Arindam Chaudhuri ... Pratixa Badelia
24 Dec 2016

JPEG for Arabic Handwritten Character Recognition: Add a Dimension of Application
Abdurazzag Ali ... Salem Ali
01 Oct 2008

Retrieving and combining repeated passages to improve OCR
...
19 Jun 2017
