Cursive script identification using Gabor features and SVM classifier

P Sivakumar, Kilvisharam Oziuddeen Mohammed Aarif

doi:10.1504/ijcaet.2020.10027369

Abstract

Script identification is one of the challenging stages of an optical character recognition system for bilingual or multilingual document images. Significant research on script identification has been reported over the last two decades, concentrating largely on languages such as Latin, Chinese, Hindi and French. Very little effort has been made on script identification for cursive languages such as Arabic, Urdu and Pashto. Most of the ancient Urdu literature that is yet to be digitised includes both Urdu and Arabic text. In this paper, we present word-level script identification of Urdu and Arabic text using Gabor features with suitable orientations and frequencies. The proposed model is trained using a support vector machine (SVM) classifier, and the results achieved are very promising.
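
To make the feature-extraction idea concrete, the following is a minimal sketch of a Gabor filter bank applied to a small word image, written in plain Python. It is not the authors' implementation: the kernel size, the Gaussian width `sigma`, the two frequencies, the four orientations, the use of mean and standard deviation as features, and the function names are all assumptions chosen for illustration, since the abstract does not state them. The resulting feature vector is the kind of input that could then be passed to an SVM classifier (not shown here).

```python
import math

def gabor_kernel(freq, theta, sigma=2.0, half=4):
    """Real (cosine) part of a Gabor kernel, returned as a list of rows.
    sigma and half (kernel half-width) are illustrative assumptions."""
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * freq * xr))
        kernel.append(row)
    return kernel

def filter_valid(image, kernel):
    """Correlate image with kernel at fully overlapping positions only."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        for j in range(len(image[0]) - kw + 1):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    acc += image[i + u][j + v] * kernel[u][v]
            out.append(acc)
    return out

def gabor_features(image, freqs=(0.1, 0.25), n_orient=4):
    """Concatenate the mean and standard deviation of each filter response.
    The frequencies and orientation count are assumed, not from the paper."""
    feats = []
    for f in freqs:
        for k in range(n_orient):
            resp = filter_valid(image, gabor_kernel(f, k * math.pi / n_orient))
            mean = sum(resp) / len(resp)
            var = sum((r - mean) ** 2 for r in resp) / len(resp)
            feats.extend([mean, math.sqrt(var)])
    return feats

# Toy 16x16 "word image": vertical stripes with a period of 4 pixels.
stripes = [[1.0 if (x // 2) % 2 == 0 else 0.0 for x in range(16)]
           for _ in range(16)]
features = gabor_features(stripes)  # 2 frequencies x 4 orientations x 2 stats
```

In this toy case the filter tuned to frequency 0.25 at orientation 0 responds far more strongly to the stripes than the same frequency at orientation π/2, which illustrates why orientation- and frequency-selective responses can help separate scripts with different stroke structure.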
