Abstract

In this paper, an Optical Character Recognition engine for Kannada and English characters is proposed, based on zone features. Zoning is one of the older concepts in document image analysis research, but it works well for Kannada and English character recognition. A total of 2800 sample images of Kannada consonants and 2300 sample images of English lowercase letters are classified using an SVM classifier. All preprocessed images are normalized to 32 x 32 pixels, which is taken as the optimum size. Each preprocessed image is then divided into 64 non-overlapping zones, and the pixel density is calculated for each zone, thereby generating 64 features. These features are fed to the SVM classifier to classify the character images. The performance of the algorithm is tested with 2-fold cross-validation. Average recognition accuracies of 73.33% and 96.13% are obtained for Kannada consonants and English lowercase letters, respectively. Further, an average recognition accuracy of 83.02% is obtained for mixed input containing both Kannada and English characters. The recognition accuracy for Kannada consonants is low because most of the characters are similar in shape. Hence, more dominant features may need to be added to discriminate between these characters, and work in this direction is in progress. This is an initial attempt at recognizing a mixture of Kannada and English characters with a single algorithm. The novelty of the algorithm is that it is independent of thinning and of the slant of the characters.
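The zone-based feature extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a binarized 32 x 32 image stored as a list of rows with foreground pixels equal to 1, an 8 x 8 grid of square 4 x 4 zones (the abstract gives only the count of 64 zones, not their shape), and density defined as the fraction of foreground pixels in a zone. The function name `zone_density_features` and the `zone_size` parameter are hypothetical.

```python
def zone_density_features(image, zone_size=4):
    """Return one foreground-pixel density per non-overlapping zone.

    Assumptions (not stated in the abstract): `image` is a square binary
    image given as a list of rows, foreground pixels are 1, and zones are
    square blocks of `zone_size` x `zone_size` pixels scanned row by row.
    """
    size = len(image)
    if any(len(row) != size for row in image):
        raise ValueError("image must be square")
    if size % zone_size != 0:
        raise ValueError("image size must be a multiple of zone_size")

    pixels_per_zone = zone_size * zone_size
    features = []
    for top in range(0, size, zone_size):
        for left in range(0, size, zone_size):
            # Count the foreground pixels inside the current zone.
            foreground = sum(
                image[row][col]
                for row in range(top, top + zone_size)
                for col in range(left, left + zone_size)
            )
            features.append(foreground / pixels_per_zone)
    return features


# Usage: a 32 x 32 image yields a 64-element feature vector.
blank = [[0] * 32 for _ in range(32)]
vector = zone_density_features(blank)
print(len(vector))
```

The resulting 64-element vector is what would be passed to an SVM classifier. The classifier, its kernel, and its parameters are not specified in the abstract, so they are left out of this sketch.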
