Abstract
This paper presents a comparative performance analysis of feature(s)-classifier combinations for a Devanagari optical character recognition system. For performance evaluation, three classifiers, namely support vector machines, artificial neural networks and k-nearest neighbors, and seven feature extraction approaches, viz. profile direction codes, transition, zoning, directional distance distribution, Gabor filter, discrete cosine transform (DCT) and gradient features, have been used. The first four features have been used jointly as statistical features. Performance has also been evaluated using combinations of these feature extraction approaches, and by varying the feature vector length of the Gabor and DCT features. For training the classifiers, 7000 samples of the first 70 classes (out of the 942 classes recognized in earlier work) have been used. Such a large number of classes arises from horizontally and vertically fused/overlapping characters. We have chosen the first 70 classes because their combined percentage contribution among the 942 classes was found to be 96.69%. For testing, 1400 samples have been collected separately. A corpus of 25 books has been used for sample collection. Classifiers trained on different features have been compared for performance evaluation. It has been found that support vector machines trained with gradient features provide a classification accuracy of 99.429%, and that there is no significant increase in performance with an increase in feature vector length.
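To make the idea of a feature-classifier combination concrete, the following is a minimal sketch of one such pairing: zoning features fed to a k-nearest-neighbor classifier. It is not the implementation evaluated in this work. The function names, the 2x2 zone grid, the 4x4 toy images and the labels are all assumptions chosen for illustration, and zoning is reduced here to foreground-pixel density per zone.

```python
from collections import Counter
from math import dist


def zoning_features(image, zones_per_side=2):
    """Foreground-pixel density per zone of a square binary image.

    Assumption: the image side is divisible by zones_per_side.
    """
    size = len(image)
    step = size // zones_per_side
    features = []
    for zr in range(zones_per_side):
        for zc in range(zones_per_side):
            total = sum(
                image[r][c]
                for r in range(zr * step, (zr + 1) * step)
                for c in range(zc * step, (zc + 1) * step)
            )
            features.append(total / (step * step))
    return features


def knn_predict(train, query, k=1):
    """Majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=1):
    """Percentage of test samples whose predicted label is correct."""
    correct = sum(knn_predict(train, f, k) == y for f, y in test)
    return 100.0 * correct / len(test)


# Hypothetical toy "characters": a vertical and a horizontal stroke.
VERTICAL = [[1, 0, 0, 0]] * 4
HORIZONTAL = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
# A vertical stroke with its last pixel missing, used as a test sample.
NOISY_VERTICAL = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

train = [
    (zoning_features(VERTICAL), "vertical"),
    (zoning_features(HORIZONTAL), "horizontal"),
]
test = [(zoning_features(NOISY_VERTICAL), "vertical")]

print(knn_predict(train, test[0][0]))
print(accuracy(train, test))
```

Swapping `zoning_features` for another extractor, or `knn_predict` for another classifier, while keeping the training and test sets fixed gives the kind of comparison grid described above.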