Template Matching Based Probabilistic Optical Character Recognition for Urdu Nastaliq Script

Qaiser Abbas

doi:10.54692/lgurjcsit.2021.0502207

Abstract

This paper presents a technique for optical recognition of Urdu characters that combines template matching with a probabilistic N-gram language model. The dataset contains both printed and typed text. The model performs three types of segmentation: line segmentation using horizontal projection, ligature segmentation using connected component labeling, and character segmentation using corner and pointer techniques. A separate stochastic lexicon, built from a collected corpus, stores the probability values of the N-grams. Using template matching together with the N-gram language model, our study predicts complete segmented words with promising results, particularly in the case of bigrams. It outperforms three of four existing models, with an accuracy of 97.33%. The results on our test dataset are encouraging, but they also point to directions for further improvement of the model.
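As an illustration of the first segmentation step named in the abstract, the following is a minimal sketch of line segmentation by horizontal projection. It is not the paper's implementation: the function name `segment_lines`, the `min_ink` threshold, and the representation of the page as a list of 0/1 pixel rows are all assumptions made for this example. Real Nastaliq pages, where diacritics and overlapping ligatures can blur the gaps between lines, would likely need a more careful threshold.

```python
def segment_lines(image, min_ink=1):
    """Split a binary page image into text lines via horizontal projection.

    image: list of rows, each a list of 0 (background) / 1 (ink) pixels.
    min_ink: hypothetical threshold; rows with fewer ink pixels count as gaps.
    Returns a list of (top, bottom) row ranges, with bottom exclusive.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in image]

    lines = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                     # entering a text line
        elif ink < min_ink and start is not None:
            lines.append((start, y))      # leaving a text line
            start = None
    if start is not None:                 # a line that touches the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy 5x4 "page": two text lines separated by one blank row.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(segment_lines(page))  # prints [(1, 3), (4, 5)]
```

Each returned row range could then be cropped and passed on to the ligature stage, which the abstract says uses connected component labeling.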

More From: Lahore Garrison University Research Journal of Computer Science and Information Technology

Similar Papers

Probabilistic Modeling of Joint-context in Distributional Similarity
Oren Melamud ... Deniz Yuret
01 Jan 2014

Sliding text recognition in broadcast news
Erinc Dikici ... Murat Saraclar
01 Apr 2008

Model design for grammatical error identification in software requirements specification using statistics and rule-based techniques
F P Putra ... D Enda
Journal of Physics: Conference Series | VOL. 1450
01 Feb 2020

Bangla License Plate Detection, Recognition and Authentication with Morphological Process and Template Matching
Md Ashraful Islam ... Gulam Mahfuz Chowdhury
21 May 2021
