Rapid movement generation models are described in the literature as an efficient tool for capturing handwriting behavior. Their fields of application are diverse, including handwriting description, regeneration, and, more recently, OCR. In this paper, we propose a grapheme-based approach to offline Arabic writer identification and verification. Rather than extracting natural graphemes from a training corpus using segmentation and clustering, our approach synthesizes its own graphemes based on the beta-elliptic model. Its originality lies in making the grapheme codebook independent of any training process by relying on a model instead. One full and four partial codebooks are generated and tested. Using feature selection, the raw codebooks are reduced in size according to three criteria: FDR, FDR combined with cross-correlation, and random subsampling. A total of 60 feature vectors are extracted using template matching and evaluated on 411 individual writers from the IFN/ENIT database. The results demonstrate the wide representativity and good generalization capability of synthetic codebooks. We obtained a top-1 rate of 90.02% and a top-5 rate of 96.35% for writer identification, and an EER of 2.1% for writer verification. Our approach performed better than most of the surveyed techniques in terms of supported corpus size and identification rates. To the best of our knowledge, this study is among the first to exploit the concept of model-based synthetic codebooks in writer identification and verification.
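
To make the reported top-1 and top-5 identification rates concrete, the following is a minimal sketch of how such a rate can be computed. It is not the authors' evaluation code. It assumes that each query document has a distance to every enrolled writer; the function name `top_k_rate` and the toy distance values are hypothetical and chosen only for illustration.

```python
def top_k_rate(distances, true_ids, gallery_ids, k):
    """Fraction of queries whose true writer is among the k nearest writers.

    distances   -- one row per query, one distance per enrolled writer
    true_ids    -- the correct writer id for each query
    gallery_ids -- the writer id behind each column of `distances`
    """
    hits = 0
    for row, true_id in zip(distances, true_ids):
        # Rank enrolled writers from the smallest to the largest distance.
        ranked = sorted(range(len(row)), key=lambda j: row[j])
        candidates = {gallery_ids[j] for j in ranked[:k]}
        hits += true_id in candidates
    return hits / len(true_ids)


# Toy data (hypothetical): 3 query documents, 4 enrolled writers.
distances = [
    [0.1, 0.5, 0.9, 0.7],  # true writer 0 is ranked first
    [0.6, 0.4, 0.2, 0.8],  # true writer 1 is ranked second
    [0.3, 0.2, 0.9, 0.1],  # true writer 2 is ranked last
]
gallery_ids = [0, 1, 2, 3]
true_ids = [0, 1, 2]

for k in (1, 2, 4):
    print(f"top-{k} rate: {top_k_rate(distances, true_ids, gallery_ids, k):.3f}")
```

Under this reading, a top-5 rate of 96.35% would mean that the correct writer appears among the five closest candidates for 96.35% of the queries.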