Abstract
Codebook‐based writer characterization is an effective technique that has been investigated in a number of recent studies on writer identification and verification. These methods divide a set of writing samples into small units (fragments or graphemes) and cluster these patterns to produce a codebook. The writer of a handwritten sample is then characterized by the probability distribution of producing the codebook patterns. In most cases, a small subset of the database under study is used to produce the codebook, while the rest of the database is used for evaluation. This work aims to validate the hypothesis that the codebook simply serves as a representation space in which different writings are compared and that, in most cases, the patterns in the codebook do not significantly influence identification and verification performance. The hypothesis is tested by generating a number of codebooks from Greek, Arabic and Chinese handwritten samples. Codebooks built from fragments of handwritten music scores, printed text and synthetic data are also investigated. Evaluations on three well‐known handwriting databases (CVL, BFL and IAM) support the idea that, in general, the codebook patterns do not have a significant impact on characterizing a writer from handwriting.
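The characterization step described in the abstract can be illustrated with a minimal sketch. It assumes that fragments have already been extracted and converted to fixed-length feature vectors, and that a codebook is already available. The function names, the toy 2-D vectors, the Euclidean nearest-pattern assignment and the chi-square distance are all illustrative assumptions, not details taken from the paper.

```python
from math import dist


def codebook_histogram(fragments, codebook):
    """Return the normalized frequency of each codebook pattern for one sample."""
    counts = [0] * len(codebook)
    for frag in fragments:
        # Assign the fragment to its nearest codebook pattern (Euclidean distance).
        nearest = min(range(len(codebook)), key=lambda k: dist(frag, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def chi_square_distance(p, q):
    """Chi-square distance between two distributions (an assumed choice of metric)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


# Hypothetical toy data: 2-D vectors standing in for real fragment descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
sample_a = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2), (1.0, 0.9)]
sample_b = [(0.1, 0.9), (0.0, 1.1), (0.2, 1.0), (0.9, 1.0)]

hist_a = codebook_histogram(sample_a, codebook)
hist_b = codebook_histogram(sample_b, codebook)
distance_ab = chi_square_distance(hist_a, hist_b)
```

In this sketch the codebook only fixes the bins over which the two samples are compared, which is the role the abstract's hypothesis assigns to it; a smaller distance between two histograms would suggest more similar writing.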