Document age estimation from handwritten text line images is useful for several pattern recognition and artificial intelligence applications, such as forged signature verification, writer identification, gender identification, personality trait identification, and fraudulent document identification. This paper presents a novel method for document age classification at the text line level. To segment text lines from handwritten document images, wavelet decomposition is used in a novel way. We explore multiple levels of wavelet decomposition, which introduce more blur as the number of levels increases, to detect word components. The detected components are then used in a direction-guided growing approach with linearity and nonlinearity criteria to segment text lines. To classify text line images of different ages, the proposed method builds on the observation that image quality degrades as a document ages: it extracts structural, contrast, and spatial features to study degradation at different wavelet decomposition levels. The specific advantages of DenseNet, namely strong feature propagation, mitigation of the vanishing gradient problem, feature reuse, and a reduced number of parameters, motivated us to use DenseNet121 together with a Multi-layer Perceptron (MLP) to classify text lines of different ages, with the extracted features and the original image as input. To demonstrate the efficacy of the proposed model, experiments were conducted on our own dataset as well as standard datasets for both text line segmentation and document age classification. The results show that the proposed method outperforms existing methods for text line segmentation in terms of precision, recall, and F-measure, and for document age classification in terms of average classification rate.
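The blur introduced by successive decomposition levels can be illustrated with a minimal sketch. It assumes a Haar wavelet, which the abstract does not specify, and keeps only the low-pass (approximation) band at each level. The function names `haar_approximations` and `detail_energy` are hypothetical and are not taken from the paper; the sharpness proxy is likewise an illustrative choice, not one of the paper's features.

```python
import numpy as np


def haar_approximations(image, levels):
    """Return the low-pass band at each of `levels` Haar decomposition levels.

    Assumption: Haar wavelet. Each band is the 2x2 block average, which equals
    the orthonormal Haar LL band divided by 2, so mean intensity is preserved.
    """
    current = np.asarray(image, dtype=np.float64)
    bands = []
    for _ in range(levels):
        # Crop to even height and width so the image splits into 2x2 blocks.
        h, w = current.shape
        current = current[: h - h % 2, : w - w % 2]
        # Average each 2x2 block to get the next, coarser approximation.
        current = (
            current[0::2, 0::2] + current[0::2, 1::2]
            + current[1::2, 0::2] + current[1::2, 1::2]
        ) / 4.0
        bands.append(current)
    return bands


def detail_energy(band):
    """Mean squared neighbour difference (horizontal plus vertical).

    A crude, hypothetical sharpness proxy; needs at least a 2x2 input.
    """
    dx = np.diff(band, axis=1)
    dy = np.diff(band, axis=0)
    return float((dx ** 2).mean() + (dy ** 2).mean())
```

Calling `haar_approximations(line_image, 3)` on a grayscale text line image returns three progressively coarser bands; at coarser levels the strokes of a word tend to merge, which is the property the method relies on when detecting word components.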