Abstract

This article presents AHAWP, a dataset of handwritten Arabic alphabets, words and paragraphs. The dataset contains 65 different Arabic alphabets (covering the beginning, middle, end and regular forms), 10 different Arabic words (which together encompass all Arabic alphabets) and 3 different paragraphs. The dataset was collected anonymously from 82 different users. Each user was asked to write each alphabet and each word 10 times. A user ID uniquely but anonymously identifies the writer of each alphabet, word and paragraph. In total, the dataset consists of 53199 alphabet images, 8144 word images and 241 paragraph images. The dataset can serve several purposes. It can be used for optical handwriting recognition of alphabets and words, and for writer identification (or verification) of handwritten Arabic text. It also makes it possible to evaluate differences in writing style between an isolated alphabet and the same alphabet written by the same user as part of a word or paragraph. The dataset is publicly available at https://data.mendeley.com/datasets/2h76672znt/1.
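As a rough sanity check on the figures above, the following minimal sketch compares the image counts implied by the stated collection protocol with the totals reported in the abstract. The names `PROTOCOL` and `nominal_vs_reported` are hypothetical, and the assumption that each user wrote each paragraph once is not stated in the abstract; it is inferred only because 3 × 82 is close to the reported 241.

```python
# Figures taken from the abstract: 82 users, 10 repetitions per alphabet/word.
USERS = 82
REPETITIONS = 10

# (distinct items, repetitions per user, image count reported in the abstract)
# Assumption: each paragraph was written once per user (not stated in the source).
PROTOCOL = {
    "alphabets": (65, REPETITIONS, 53199),
    "words": (10, REPETITIONS, 8144),
    "paragraphs": (3, 1, 241),
}


def nominal_vs_reported(protocol=PROTOCOL, users=USERS):
    """Return nominal count, reported count and their difference per category."""
    summary = {}
    for name, (items, reps, reported) in protocol.items():
        nominal = items * reps * users  # count if every user completed every sample
        summary[name] = {
            "nominal": nominal,
            "reported": reported,
            "shortfall": nominal - reported,
        }
    return summary


if __name__ == "__main__":
    for name, row in nominal_vs_reported().items():
        print(name, row)
```

Under these assumptions the reported totals fall slightly below the nominal ones in every category, which would be consistent with a small number of missing or discarded samples; the abstract itself does not explain the difference.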
