Abstract

This paper describes the construction of an online and offline Japanese handwritten character dataset and experiments on writer identification using it. Handwriting data were collected for forensic writer identification, with each character written five times so that intra-individual constancy, which is as important as inter-individual difference, could be examined. Handwritten samples were collected from 401 participants. The online data comprise the pen-tip location, the pen pressure and the time at each measuring point. The dataset contains 702 characters, including uppercase and lowercase Latin alphabet letters, numerals, Hiragana, Katakana and Kanji characters. Three kinds of writer classification experiments were carried out using the offline data. The image data used were 26 uppercase and 26 lowercase Latin alphabet letters, 26 Hiragana characters and 26 Kanji characters written by 10 writers. In each experiment, four of the five samples per writer and per letter or character were used for training, and the remaining sample was used for testing. LeNet was used as the classifier. The correct classification rate was about 80% for every character type except lowercase letters, for which it was 70%. The rate improved as the amount of training data increased. Some participants showed a high correct classification rate, and their samples were distinctive in size, position or line quality. These results suggest that, for classification with a convolutional neural network, sample size matters more than uniform writing conditions, and that some characteristics are consistent across all of a writer's handwriting.
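The four-to-one split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_by_repetition`, the writer labels, and the choice of the last repetition as the held-out sample are all assumptions, since the abstract does not say which of the five samples was reserved for testing.

```python
from itertools import product


def split_by_repetition(writers, characters, repetitions=5, held_out=4):
    """Split (writer, character, repetition) keys into train and test lists.

    One repetition index per writer and character goes to the test set;
    the other repetitions go to the training set. Which index is held out
    is an assumption made for this sketch.
    """
    train, test = [], []
    for writer, char, rep in product(writers, characters, range(repetitions)):
        if rep == held_out:
            test.append((writer, char, rep))
        else:
            train.append((writer, char, rep))
    return train, test


# Hypothetical labels: 10 writers and the 26 uppercase Latin letters.
writers = [f"writer_{i:02d}" for i in range(10)]
characters = [chr(ord("A") + i) for i in range(26)]

train, test = split_by_repetition(writers, characters)
print(len(train), len(test))  # 1040 training keys, 260 test keys
```

With 10 writers, 26 characters and 5 repetitions there are 1300 samples per character type, so each experiment of this shape trains on 1040 images and tests on 260, with every writer and character represented in both sets.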
