Recognition of Nastaliq Urdu Text using Multi-SVM

Mehvish Yasin (Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra), Herleen Kour* (Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, India), Dr Naveen Gondhi

doi:10.35940/ijrte.e6949.018520

Abstract

Optical Character Recognition (OCR) has emerged as an active research field. A substantial body of work addresses Urdu script, and diverse methodologies have been proposed for the Nastaliq font style. In Nastaliq, Urdu text is written diagonally from top to bottom. This feature makes the script highly cursive and context-sensitive, which leads to a difficult recognition problem. Because of these peculiarities of the Nastaliq writing style, we choose the ligature as the basic unit of recognition in order to reduce the complexity of the system. The accuracy of ligature recognition in Urdu text depends on how well the ligatures are segmented. In addition to extracting connected components, ligature segmentation takes into account factors such as baseline information, height, width, and centroid. In this paper, ligature recognition is performed using a multi-SVM (Support Vector Machine) approach, which achieves an accuracy of 97% when 903 text images are fed to it.
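The segmentation cues named in the abstract (connected components, baseline, height, width, centroid) can be illustrated with a minimal sketch. This is not the authors' implementation. The 8-connectivity rule, the "densest row" baseline heuristic, the tiny `SAMPLE` image, and all function names are assumptions made only for illustration.

```python
from collections import deque

# Hypothetical binarized text image: 1 = ink, 0 = background.
SAMPLE = [
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 1, 1],
]


def connected_components(image):
    """Group 8-connected ink pixels; return one list of (row, col) per component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def describe(pixels):
    """Bounding-box height and width plus centroid of one component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return {
        "height": max(ys) - min(ys) + 1,
        "width": max(xs) - min(xs) + 1,
        "centroid": (sum(ys) / len(ys), sum(xs) / len(xs)),
    }


def estimate_baseline(image):
    """Assumed heuristic: the baseline is the row containing the most ink."""
    ink_per_row = [sum(row) for row in image]
    return ink_per_row.index(max(ink_per_row))


components = connected_components(SAMPLE)
features = [describe(pixels) for pixels in components]
baseline_row = estimate_baseline(SAMPLE)
```

In a fuller pipeline, feature records like these could help decide which small components (such as dots) belong to which main ligature body, for example by comparing centroids against the baseline row, before the grouped ligatures are passed to a multi-class SVM. That association step and the classifier are not shown here.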

Full Text