Touching Syllable Segmentation using Split Profile Algorithm

T R Ganesh Babu, D Akbar Hussian, B Raveendra Babu, Leila Reddy

doi:10.1109/cnc.2010.94

Abstract

The most challenging task in a character recognition system is segmenting the individual components of the script with maximum efficiency. This process is relatively easy for stroke-based and standard scripts. Cursive scripts are more complex, possessing a large number of overlapping and touching objects, so the statistical behaviour of their topological properties must be studied extensively to achieve the highest accuracy. Unconstrained handwritten text and South Indian scripts share a certain amount of similarity in terms of topology, component combinations, and overlapping and merging characteristics. The concept of syllables and their formation adds further complexity in Indian scripts. In this paper, the statistical behaviour of the cursive script Telugu is presented. Its topological properties, in terms of zones, component combinations, and the behavioural aspects of syllables, are studied and adopted in the segmentation process. The statistical behaviour of cursive components is evaluated. A Split Profile Algorithm is proposed for handling touching components. The proposed algorithm is evaluated on different fonts and sizes. Its performance is compared with two approaches, viz. the aspect ratio and syllable width approaches.

Full Text