This paper describes a lexical analysis (segmentation) approach in pattern recognition for Online Handwritten Character Recognition (OHCR) in Malayalam. The subunits (pattern primitives) of the single-stroke vowel characters in Malayalam are identified and labeled to obtain a reference set of characters. Handwritten character samples are segmented into pattern primitives, as defined by the reference set, using a combined approach of the Ramer-Douglas-Peucker algorithm and the eight-direction Freeman code. Features that are unique to the primitives of a character are then extracted. The discriminating features identified are the direction of the first primitive, the segment count, a cusp in the second primitive, a crossing in the third primitive, and a cusp in the seventh primitive. The experiments were conducted on 100 samples per character whose segmentation exactly matched the reference set. With this five-dimensional feature set, the study achieved a recognition rate of 95.77% under five-fold cross-validation using a Support Vector Machine with an RBF kernel. The study shows that segmenting characters into pattern primitives is an effective method for realizing accurate Malayalam OHCR systems for real-time applications.
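As a rough illustration of the segmentation idea, the following Python sketch simplifies a pen stroke with the Ramer-Douglas-Peucker algorithm and then assigns an eight-direction Freeman code to each remaining segment. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tolerance `epsilon`, the merging of consecutive equal codes into one primitive, and the convention that code 0 points along +x with codes increasing counter-clockwise (y axis pointing up) are all assumptions made for this example.

```python
import math
from itertools import groupby


def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # a and b coincide: fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker polyline simplification."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord joining the endpoints
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax > epsilon:
        # Keep that point and simplify both halves recursively
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]


def freeman_code(a, b):
    """Eight-direction code of segment a->b.

    Assumed convention: 0 = +x, increasing counter-clockwise in steps
    of 45 degrees, with the y axis pointing up.
    """
    angle = math.atan2(b[1] - a[1], b[0] - a[0])
    return int(round(angle / (math.pi / 4))) % 8


def primitives(points, epsilon):
    """Direction codes of the simplified stroke, with runs of equal codes
    merged (assumption: one run corresponds to one primitive)."""
    simplified = rdp(points, epsilon)
    codes = [freeman_code(p, q) for p, q in zip(simplified, simplified[1:])]
    return [code for code, _ in groupby(codes)]


# Hypothetical stroke: right, then up, then left, with small jitter
stroke = [(0, 0), (1, 0.05), (2, 0), (2.05, 1), (2, 2), (1, 2.02), (0, 2)]
prims = primitives(stroke, epsilon=0.2)
features = {"first_direction": prims[0], "segment_count": len(prims)}
```

Two of the five features named in the abstract, the direction of the first primitive and the segment count, fall out of this representation directly. The cusp and crossing features would need additional geometric tests that are not sketched here.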