Recognition of Cursive Pashto Optical Digits and Characters with Trio Deep Learning Neural Network Models

Muhammad Zubair Rehman, Nazri Mohd Nawi, Abdullah Khan, Mohammad Arshad

doi:10.3390/electronics10202508


Open Access

https://doi.org/10.3390/electronics10202508


Abstract

Pashto is one of the most ancient and historical languages in the world and is spoken in Pakistan and Afghanistan. Languages such as Urdu, English, Chinese, and Japanese have OCR applications, but very little work has been conducted on the Pashto language in this respect. Handwritten characters and digits are harder for OCR applications to recognize because handwriting is influenced by the writer’s hand dynamics. Moreover, no publicly available dataset of handwritten Pashto digits existed before this study, and consequently no work had been performed on recognizing Pashto handwritten digits and characters combined. To address this gap, a dataset of Pashto handwritten digits consisting of 60,000 images was created. Three deep learning convolutional neural network models (the trio of CNN, LeNet, and Deep CNN) were trained and tested on both the Pashto handwritten character and digit datasets. In the simulations, the Deep CNN achieved 99.42 percent accuracy for Pashto handwritten digits, 99.17 percent for handwritten characters, and 70.65 percent for combined digits and characters. The LeNet and CNN models achieved slightly lower accuracies (LeNet: 98.82, 99.15, and 69.82 percent; CNN: 98.30, 98.74, and 66.53 percent) on the Pashto handwritten digit, Pashto character, and combined digit and character datasets, respectively. Based on these results, the Deep CNN is the best of the three models in terms of accuracy and loss.
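As a reading aid, the accuracy figures quoted in the abstract can be tabulated and compared with a few lines of Python. This is only a sketch of the arithmetic: the dictionary layout and the helper names `best_model` and `margin_over_runner_up` are illustrative assumptions, not part of the paper, and the numbers are copied from the abstract rather than recomputed.

```python
# Accuracies in percent, copied from the abstract.
# The dataset keys ("digits", "characters", "combined") are labels chosen here.
REPORTED = {
    "Deep CNN": {"digits": 99.42, "characters": 99.17, "combined": 70.65},
    "LeNet": {"digits": 98.82, "characters": 99.15, "combined": 69.82},
    "CNN": {"digits": 98.30, "characters": 98.74, "combined": 66.53},
}


def best_model(dataset):
    """Return (model name, accuracy) for the highest reported accuracy."""
    pairs = ((name, scores[dataset]) for name, scores in REPORTED.items())
    return max(pairs, key=lambda pair: pair[1])


def margin_over_runner_up(dataset):
    """Return the gap in percentage points between the top two models."""
    ranked = sorted((scores[dataset] for scores in REPORTED.values()), reverse=True)
    return round(ranked[0] - ranked[1], 2)


if __name__ == "__main__":
    for dataset in ("digits", "characters", "combined"):
        name, accuracy = best_model(dataset)
        gap = margin_over_runner_up(dataset)
        print(f"{dataset}: {name} at {accuracy:.2f}% (+{gap:.2f} points)")
```

On these figures the Deep CNN leads on all three datasets, although its margin over LeNet on the character dataset is only 0.02 percentage points.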

Highlights

Having a basic knowledge of the reading and semantics of any specific language or script enables a native human to read and understand text documents in that language
This study proposes the use of Deep Convolutional Neural Network (DCNN) for recognition of three different datasets, i.e., Pashto digits, Pashto characters, and combined Pashto digit and character datasets
The performance of the proposed model (Deep CNN) was compared with that of CNN and LeNet

Summary

Introduction

Having a basic knowledge of the reading and semantics of any specific language or script enables a native human to read and understand text documents in that language. Because printed text is uniform in style, printed characters and digits are easy to train on and recognize [18], whereas handwritten scripts vary from person to person. Such variation is relatively easy for humans to handle but is a challenging task for a machine, especially when a single character has multiple shapes. To overcome this challenge for Pashto digits, a proper dataset of Pashto digits is needed to train the machine.

Related Work

The Proposed Methodology

Data Collections

Results and Discussion

Preliminaries

Experiments

Pashto Character Dataset

Pashto Digit Dataset

Combined Pashto Digit and Character Dataset

Conclusions and Future Work


Journal: Electronics	Publication Date: Oct 15, 2021
Citations: 6	License type: CC BY 4.0
