A novel approach to text line and word segmentation on Odia printed documents

D Senapati, S Rout, M Nayak

doi:10.1109/icccnt.2012.6396063

Abstract

Optical character recognition (OCR) is the electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. OCR systems are available for various scripts, such as English, Chinese, and Arabic, but none is commercially available for Odia script. We have taken a step toward developing an OCR system for the Odia language. OCR is popular for its potential applications in banks, library automation, post offices, defense organizations, and language processing. Line and word segmentation is one of the important steps of an OCR system: the accuracy of word and character recognition is directly affected by the correctness of text-line and word segmentation. In this paper we propose a robust method for segmenting individual text lines in an image of a printed Odia document. Each segmented text line is the input to the word segmentation method, which produces segmented words. The technique is based on the intensities of pixels in the document, and the proposed method uses both foreground and background information. We tested our method on scanned Odia documents as well as some multi-script documents and obtained encouraging results.
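The abstract does not spell out the algorithm, so the following is only a minimal sketch of a common intensity-based baseline, projection profiles, and not the authors' method. It assumes a binarized page where 1 marks ink (foreground) and 0 marks background. The names `runs_of_ink`, `segment_lines`, `segment_words`, and the `min_gap` parameter are hypothetical and chosen for illustration. Lines are found first, and each line's row range is then passed to the word step, mirroring the line-then-word pipeline described above.

```python
from typing import List, Tuple

# Assumption: a binarized image, 1 = ink (foreground), 0 = background.
Image = List[List[int]]


def runs_of_ink(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is non-zero.

    Background gaps shorter than min_gap do not split a run.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The run ended just before this gap began.
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    row_profile = [sum(row) for row in image]
    return runs_of_ink(row_profile, min_gap=1)


def segment_words(image: Image, line: Tuple[int, int],
                  min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split one text line into words using its vertical projection profile.

    min_gap is a hypothetical threshold: background columns narrower than
    this are treated as spacing inside a word rather than between words.
    """
    top, bottom = line
    width = len(image[0]) if image else 0
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    return runs_of_ink(col_profile, min_gap=min_gap)


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 1, 1, 0],
        [1, 0, 1, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0, 0, 0],
    ]
    for line in segment_lines(page):
        print(line, segment_words(page, line))
```

In this sketch the foreground (ink counts) locates text, while the background (runs of empty rows or columns) decides where to cut. A real system would need to choose `min_gap` from the document itself, for example from the distribution of gap widths, and would need extra care for Odia vowel signs and other marks that sit above or below the main glyph.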

