Abstract

Optical Character Recognition has emerged as an attractive research field nowadays. Lot of work has been done in Urdu script based on various approaches and diverse methodologies have been put forward based on Nastaliq font style. Urdu is written diagonally from top to bottom, the style known as Nastaliq. This feature of Nastaliq makes Urdu highly cursive and more sensitive leading to a difficult recognition problem. Due to the peculiarities of Nastaliq Style of writing, we have chosen ligature as a basic unit of recognition in order to reduce the complexity of system. The accuracy rate of recognizing ligature in Urdu text corresponds to the efficiency with which the ligatures are segmented. In addition to extracting connected components, the ligature segmentation takes into consideration various factors like baseline information, height, width, and centroid. In this paper ligature Recognition is performed by using multi-SVM (Sup-port Vector Machine) approach which gives an accuracy of 97% when 903 text images are fed to it.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call