Abstract

Standard databases allow different researchers to evaluate and compare pattern recognition techniques on common data; thus they are essential for the advance of research. Handwriting databases exist for various languages, but no large standard database of handwritten Farsi text is available for evaluating writer identification and verification algorithms. This paper introduces a large handwritten Farsi text database called HaFT. The database contains 1800 grayscale images of unconstrained text written by 600 writers. Each participant gave three separate eight-line handwriting samples, each written at a different time on a separate sheet. HaFT is presented in several versions, each including different lengths of text and using identical or different writing instruments. A new measure, called CVM, is defined which effectively reflects the size of handwriting and thus the content volume of a given text image. The database is designed for training and testing Farsi writer identification and verification using handwritten text. It can also be used for training and testing handwritten Farsi text segmentation and recognition algorithms. HaFT is available for research use.
