Abstract

Optical character recognition (OCR) remains an active research field in pattern recognition. OCR systems identify, recognize, and distinguish characters and text, whether printed or handwritten, by electronic means. They also transform such data into machine-processable form to facilitate interaction between user and machine in various applications. In this paper, we present the global structure of an OCR system, including its types (on-line and off-line), its categories (printed and handwritten), and its main steps. We then focus on off-line handwritten Arabic character recognition and provide a list of the main publicly available datasets. The paper also surveys the work carried out in recent years. Finally, some open issues and potential research directions are highlighted.

Highlights

  • Automatic text recognition, known as optical character recognition (OCR), is a process in which content is transformed into a comprehensible, machine-processable representation for the purpose of archiving, conducting research, editing, reusing, and transmitting the information

  • The output of OCR systems can be used in various applications, such as automatic check processing in banks, automatic mail sorting, filled-form processing, postal code recognition, and writer/gender/personality identification

  • Online handwritten character recognition systems can be further divided into (1) writer-dependent and (2) writer-independent character recognition systems (Kasturi, 2002)

  • 3.2 Off-line Systems: This type of Arabic OCR system is used to recognize text in written or printed documents; the image of the written or printed text is captured with a scanner


Summary

A Survey on Arabic Handwritten Script Recognition Systems

Soumia Djaghbellou, Department of Computer Science, University of Mohammed El Bachir El Ibrahimi, Bordj Bou Arreridj, Algeria
Abderraouf Bouziane, LMSE Laboratory, Department of Computer Science, University of Mohammed El Bachir El Ibrahimi, Bordj Bou Arreridj, Algeria
Abdelouahab Attia, Independent Researcher, Algeria
https://orcid.org/0000-0003-1558-7273

INTRODUCTION
ARABIC OCR SYSTEMS
Printed Characters
Feature Extraction Stage
Classification Performance Measures
LITERATURE ON ARABIC HANDWRITTEN RECOGNITION SYSTEMS
Generalization Ability
Use of Deep Learning
Reproducible Research
Findings
CONCLUSION
