Abstract
In a multilingual country like India, script recognition is an important pre-processing step before a document can be fed to an optical character recognition (OCR) engine, which is, in general, script specific. The present work evaluates the performance of an ensemble of two multi-layer perceptron (MLP) classifiers, each trained on a different feature set. Two complementary sets of features, namely gray-level co-occurrence matrix (GLCM) features and Gabor wavelet transform coefficients, are extracted from handwritten text-line and word images written in 12 official scripts used in the Indian subcontinent, and each feature set is fed to its own classifier. To improve the overall recognition rate, a combination approach based on the Dempster–Shafer (DS) theory is then employed to fuse the decisions of the two MLP classifiers. The performance of the combined decision is compared with that of the individual classifiers, and the proposed methodology improves recognition accuracy by about 4% for text-line data and 6% for word-level data.
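The fusion step can be illustrated with a minimal sketch of Dempster's rule of combination. The abstract does not say how the classifier outputs are turned into mass functions, so the sketch makes two assumptions: each MLP's class scores are normalised and scaled by a hypothetical reliability factor, and the remaining mass is assigned to the full set of scripts ("unknown"). The function names and the default reliability values are illustrative only and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# A mass function restricted to single scripts plus the whole frame ("unknown").
Mass = Tuple[List[float], float]


def to_mass(scores: Sequence[float], reliability: float) -> Mass:
    """Discount normalised class scores by an assumed reliability in [0, 1]."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    singletons = [reliability * s / total for s in scores]
    return singletons, 1.0 - reliability


def dempster_combine(a: Mass, b: Mass) -> Mass:
    """Combine two mass functions with Dempster's rule."""
    m1, unknown1 = a
    m2, unknown2 = b
    if len(m1) != len(m2):
        raise ValueError("both sources must cover the same classes")
    # A script keeps mass when both sources pick it, or when one picks it
    # and the other is undecided.
    joint = [x * y + x * unknown2 + unknown1 * y for x, y in zip(m1, m2)]
    unknown = unknown1 * unknown2
    # Everything else is conflict; dividing by the agreement renormalises.
    agreement = sum(joint) + unknown
    if agreement <= 0:
        raise ValueError("sources are in total conflict")
    return [j / agreement for j in joint], unknown / agreement


def fuse_decision(scores_glcm: Sequence[float],
                  scores_gabor: Sequence[float],
                  rel_glcm: float = 0.9,
                  rel_gabor: float = 0.9) -> int:
    """Return the index of the script with the largest combined mass."""
    combined, _ = dempster_combine(to_mass(scores_glcm, rel_glcm),
                                   to_mass(scores_gabor, rel_gabor))
    return max(range(len(combined)), key=combined.__getitem__)
```

In this sketch, a confident vote from one classifier can override a weak, disagreeing vote from the other, which is the behaviour a decision-fusion scheme is meant to provide. In practice the reliability factors would have to be estimated, for example from validation accuracy.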