Recognition of Persian/Arabic Handwritten Words Using a Combination of Convolutional Neural Networks and Autoencoder (AECNN)

Sara Khosravi, Abdolah Chalechale, Ramin Ranjbarzadeh

doi:10.1155/2022/4241016

Open Access

https://doi.org/10.1155/2022/4241016

Journal: Mathematical Problems in Engineering
Publication Date: Jul 8, 2022
Citations: 2
License type: CC BY 4.0

Affiliation: Razi University

Abstract

Despite extensive research, recognition of Persian and Arabic manuscripts remains a challenging problem because of the complicated and irregular nature of the writing, the wide vocabulary, and the diversity of handwriting styles. In Persian and Arabic words, letters are joined together, and signs such as dots are placed above or below the letters. In the proposed approach, words are first decomposed into their constituent subwords to enhance recognition accuracy. The signs of the subwords are then extracted to build a dictionary of main subwords and signs, and this dictionary is used to train a classifier. Because the proposed recognition approach is based on unsigned subwords (subwords with their signs removed), the classifier may misrecognize some subwords of a word. To overcome this, a new subword fusion algorithm is proposed based on the similarity of the main subwords and signs. Convolutional neural networks (CNNs) are used to train the classifier, and an autoencoder (AE) network is employed to extract appropriate features. The resulting hybrid network is named AECNN. The well-known Iranshahr dataset, which includes nearly 17,000 images of handwritten names of 503 cities of Iran, was used to analyze and test the proposed approach. The resulting recognition accuracy is 91.09%, indicating that the proposed approach outperforms the other methods known in the literature.
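
The decomposition step described above, splitting a word into main subword bodies and separate signs, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binarized image given as a list of 0/1 rows, uses plain 8-connected component labeling, and treats any component at or below a hypothetical pixel-count threshold (`sign_max_pixels`) as a sign. The function names and the toy image are likewise illustrative assumptions.

```python
from collections import deque


def connected_components(grid):
    """Return the 8-connected foreground (1) components as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def split_subwords_and_signs(grid, sign_max_pixels=2):
    """Split components into main bodies and signs by size.

    The size threshold is an assumption made for this sketch only.
    """
    bodies, signs = [], []
    for pixels in connected_components(grid):
        (signs if len(pixels) <= sign_max_pixels else bodies).append(pixels)
    # Persian/Arabic is written right to left, so the rightmost body comes first.
    bodies.sort(key=lambda p: max(x for _, x in p), reverse=True)
    return bodies, signs


# Toy binary image: two stroke bodies and one isolated dot above the left one.
toy = [
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 0],
]
bodies, signs = split_subwords_and_signs(toy)
```

On the toy image, `bodies` holds the two stroke components in right-to-left order and `signs` holds the single dot. A real pipeline would likely need a more robust criterion than raw pixel count, since sign size varies with pen width and scan resolution.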

Full Text