Abstract
Fraudsters often alter the handwritten contents of a document for illicit purposes. Such alterations can cause financial and mental loss to an individual or an organization, so ink analysis is necessary to identify them. Convolutional neural networks (CNNs) can be used to identify such cases of alteration, as CNNs have been highly successful in computer vision across a variety of classification tasks. However, a CNN requires a large amount of labeled data for training. Hence, a large dataset is needed for experiments on handwritten word alteration detection. Collecting, digitizing, and cropping a large number of altered and unaltered handwritten words is tedious and time-consuming. To overcome this issue, this paper presents an approach to synthetic word data generation for handwritten word alteration detection experiments. The scheme is designed so that the synthetically generated words are very similar to the original ones. To achieve this, a handwritten character dataset is prepared using 10 blue and 10 black pens. These handwritten characters are then used to create a synthetic word alteration dataset. The presented approach uses a relatively small number of handwritten character images to create a very large word alteration dataset. Further, deep learning models are trained on the synthetically generated dataset for word alteration detection.
Highlights
Most of the ink analysis techniques traditionally considered powerful require a physical copy of the document for alteration detection.
Ten blue and ten black ink pens are used to write the characters of the English alphabet. Twenty different pens are used because the dataset is created for experiments on pen ink differentiation in handwritten document forensics.
A method for synthetically creating a large dataset for handwritten word alteration detection is proposed in this paper.
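The idea of building word images from a small pool of per-pen character images can be illustrated with a minimal sketch. This is not the paper's pipeline, whose details are not given here. It assumes that every character image is a NumPy array of the same height, that images are stored in a hypothetical dictionary `glyphs` keyed by `(pen_id, character)`, and that an "altered" word is one in which some, but not all, characters come from a second pen. The names `compose_word` and `make_sample` are illustrative only.

```python
import random

import numpy as np


def compose_word(word, glyphs, pen_ids, gap=2, background=255):
    """Concatenate character images left to right into one word image.

    glyphs  : dict mapping (pen_id, char) -> H x W x C uint8 array
              (assumed: all glyphs share the same height H and channels C)
    pen_ids : one pen id per character of `word`
    """
    if len(pen_ids) != len(word):
        raise ValueError("need exactly one pen id per character")
    pieces = []
    for i, (ch, pen) in enumerate(zip(word, pen_ids)):
        glyph = glyphs[(pen, ch)]
        if i > 0:
            # Blank spacer between neighbouring characters.
            spacer = np.full(
                (glyph.shape[0], gap, glyph.shape[2]), background, dtype=glyph.dtype
            )
            pieces.append(spacer)
        pieces.append(glyph)
    return np.concatenate(pieces, axis=1)


def make_sample(word, glyphs, pens, altered, rng):
    """Return (image, label, pen_ids); label is 1 for altered, 0 for unaltered."""
    original_pen = rng.choice(pens)
    pen_ids = [original_pen] * len(word)
    if altered:
        if len(word) < 2:
            raise ValueError("an altered word needs at least two characters")
        other_pen = rng.choice([p for p in pens if p != original_pen])
        # Rewrite between 1 and len(word) - 1 characters with the second pen,
        # so that both pens are always present in an altered word.
        k = rng.randint(1, len(word) - 1)
        for idx in rng.sample(range(len(word)), k):
            pen_ids[idx] = other_pen
    image = compose_word(word, glyphs, pen_ids)
    return image, int(altered), pen_ids
```

Under these assumptions the number of distinct samples grows quickly: a single word can be rendered with any pen, and its altered versions with any ordered pair of pens and any proper subset of characters, which is how a small character set can yield a very large word dataset.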
Summary
Most of the ink analysis techniques traditionally considered powerful require a physical copy of the document for alteration detection. These techniques, such as thin layer chromatography [1], high performance liquid chromatography [2], [3], and infrared spectrum analysis of diffuse reflectance [4], are destructive in nature. Several non-destructive techniques based on hyperspectral imaging [5]–[9], Raman spectroscopy [10], [11], and luminescence lifetime [12] were introduced subsequently. These techniques, which involve spectral response, require special imaging devices. This paper will discuss only the latter category of techniques.