Abstract

An optical character recognition (OCR) system is software that automatically analyses printed text and converts it into a form that a computer can process. Handwritten character recognition (HCR) is a particularly difficult area of OCR. The phases of an HCR system are pre-processing, feature extraction, classification and recognition. Feature extraction, one of the most important of these steps, collects the distinctive features of the characters of a language, which are then used to identify and recognize those characters. One factor that affects the performance of an HCR system is the selection of a suitable set of features for representing input images. This paper discusses feature extraction and classification of Malayalam characters using a new set of geometrical features. One of the main problems in HCR systems is that characters with identical structural features cannot be recognized efficiently. The main reason for this problem is the reliance on ordinary geometrical properties. To solve this problem, this paper proposes a feature extraction method for the Malayalam language that uses a different set of structural features. In this method, the positions of standard geometric features such as loops, endpoints and junction points are used. In addition, the orientations of arcs and different types of junction point features are used to identify Malayalam characters. A support vector machine (SVM) classifier is used for classification.
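
The structural features named above (endpoints, junction points and loops, together with their positions) can be illustrated with a minimal sketch. It assumes the character has already been binarized and thinned to a one-pixel-wide skeleton stored as a list of rows of 0s and 1s. It uses the common crossing-number test for endpoints and junctions and a flood fill of the background for loops. These are standard techniques chosen for illustration, not necessarily the procedure used in the paper, and all function names are hypothetical.

```python
from collections import deque

# Clockwise order of the 8 neighbours, starting from the pixel directly above.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def pixel(img, r, c):
    """Return the pixel value, treating everything outside the image as background."""
    if 0 <= r < len(img) and 0 <= c < len(img[0]):
        return img[r][c]
    return 0


def crossing_number(img, r, c):
    """Count background-to-foreground transitions while walking around the pixel."""
    ring = [pixel(img, r + dr, c + dc) for dr, dc in RING]
    return sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)


def endpoints_and_junctions(img):
    """Return (row, col) positions: one branch = endpoint, three or more = junction."""
    endpoints, junctions = [], []
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value != 1:
                continue
            cn = crossing_number(img, r, c)
            if cn == 1:
                endpoints.append((r, c))
            elif cn >= 3:
                junctions.append((r, c))
    return endpoints, junctions


def count_loops(img):
    """Count enclosed background regions (4-connected), excluding the outside."""
    h, w = len(img) + 2, len(img[0]) + 2
    # Pad with one background pixel on every side so the outside forms one region.
    padded = [[0] * w] + [[0] + list(row) + [0] for row in img] + [[0] * w]
    seen = [[False] * w for _ in range(h)]
    regions = 0
    for r in range(h):
        for c in range(w):
            if padded[r][c] == 1 or seen[r][c]:
                continue
            regions += 1  # found a new background region; flood-fill it
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and padded[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions - 1  # subtract the single outside region


# Toy skeletons (illustrative only, not Malayalam glyphs).
T_SHAPE = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
LOOP_WITH_TAIL = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 1],
]

if __name__ == "__main__":
    for name, skeleton in (("T", T_SHAPE), ("loop with tail", LOOP_WITH_TAIL)):
        ends, joins = endpoints_and_junctions(skeleton)
        print(name, "endpoints:", ends, "junctions:", joins,
              "loops:", count_loops(skeleton))
```

The returned coordinates could then be normalized by the image size and combined with the loop count into a feature vector for a classifier such as an SVM. The arc orientations and junction types mentioned above are not covered by this sketch.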
