An Experimental Technique for OCR Line and Word Segmentation using Probability Distribution Estimation

doi:10.35940/ijrte.b1273.0782s319

Abstract

Segmentation is an important step in designing an Optical Character Recognition (OCR) system for any script. In this paper, we focus on line and word segmentation in typewritten Gurmukhi script documents. To perform this task, we adopt an OCR-based methodology in which several processing steps are implemented. Typewritten documents suffer from several issues, such as noise, skew, and poor document quality. In this work, we present a combined pre-processing scheme that performs document thresholding followed by skew detection and correction: image thresholding is carried out using Niblack's method, and skew is corrected using a gradient histogram algorithm so that a uniform orientation is obtained. A line segmentation scheme is then applied, in which a probability density function is used to generate the text distribution in a probability map. Because assigning text to the exact line it belongs to is a challenging task, we present a 2D Gaussian model that helps to identify the text boundaries in the x and y directions. The proposed methodology is applied to typewritten Gurmukhi documents, and an experimental study is carried out to show that the proposed approach achieves better performance than existing techniques.
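The thresholding and line-separation steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no parameter values, so `window` and `k` are assumed, the function names are hypothetical, and a plain row-wise text distribution stands in for the paper's probability map and 2D Gaussian model. Niblack's rule itself is standard: a pixel is treated as text when it is darker than the local mean plus `k` times the local standard deviation.

```python
import math

def niblack_threshold(img, window=3, k=-0.2):
    """Binarize a grayscale image (list of rows, values 0-255).

    Returns 1 for text (dark) pixels and 0 for background.
    window and k are assumed values, not taken from the paper.
    """
    h, w = len(img), len(img[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the local neighbourhood, clipped at the image border.
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(vals) / len(vals)
            std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            # Niblack's local threshold: T = mean + k * std.
            out[y][x] = 1 if img[y][x] < mean + k * std else 0
    return out

def row_distribution(binary):
    """Fraction of all text pixels that falls on each row (sums to 1)."""
    counts = [sum(row) for row in binary]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]

def segment_lines(binary, min_prob=0.0):
    """Return inclusive (start_row, end_row) bands whose row probability
    exceeds min_prob. A simplified stand-in for the paper's line model."""
    probs = row_distribution(binary)
    lines, start = [], None
    for y, p in enumerate(probs):
        if p > min_prob and start is None:
            start = y                      # a text band begins
        elif p <= min_prob and start is not None:
            lines.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(probs) - 1))
    return lines
```

In this sketch, rows with no text pixels act as line separators. Touching or skewed lines would defeat that assumption, which is the difficulty the abstract's skew correction and 2D Gaussian modelling are meant to address.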


More From: International Journal of Recent Technology and Engineering


Similar Papers

A robust method for line and word segmentation in handwritten text
Abdelaali Hassaine
01 Jan 2013

Line Segmentation Challenges in Tamil Language Palm Leaf Manuscripts
R Spurgen Ratheash ... M Mohamed Sathik
International Journal of Innovative Technology and Exploring Engineering | VOL. 9
30 Nov 2019

Line and Word Segmentation of handwritten text documents written in Gurmukhi Script using mid point detection technique
Payal Jindal ... Balkrishan Jindal
01 Dec 2015

A Review of Various Line Segmentation Techniques Used in Handwritten Character Recognition
Solley Joseph ... Jossy George
23 Jun 2022
