Multi-Oriented English Text Line Extraction Using Background and Foreground Information

Partha Pratim Roy, Umapada Pal, Fumitaka Kimura, Josep Lladós

doi:10.1109/das.2008.83

Abstract

In graphical documents (e.g., maps and engineering drawings), artistic documents, and similar printed materials, text lines are often not parallel to each other; they are multi-oriented and curved in nature. For OCR of such documents, individual text lines must first be extracted. Extracting individual text lines from multi-oriented and/or curved text documents is a difficult problem. In this paper, we propose a novel method to extract individual text lines from such document pages, based on the foreground and background information of the characters of the text. The water reservoir concept is used to capture the background information. In the proposed scheme, individual components are first detected and grouped into 3-character clusters using their inter-component distance, size, and positional information. Applying a graph concept, the initial 3-character clusters are then merged into larger cluster groups. Using inter-character background information, the orientations of the extreme characters of a larger cluster are determined, and based on these orientations two candidate regions are formed from the cluster. Finally, with the help of these candidate regions, individual lines are extracted. In our experiments, we obtained encouraging results.
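
The grouping and merging steps described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the bounding-box representation, the thresholds (`dist_factor`, `size_ratio`, `min_angle_deg`), the collinearity test, the rule that clusters sharing any component are merged, and all function names are assumptions chosen for illustration. The water reservoir analysis and the candidate-region step are not covered.

```python
import math
from itertools import combinations

def center(box):
    """Center of a bounding box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def size(box):
    """Component size, taken here as the larger box dimension (assumption)."""
    return max(box[2], box[3])

def three_char_clusters(boxes, dist_factor=2.0, size_ratio=2.0, min_angle_deg=150.0):
    """Return index triples (a, mid, b): two close, similar-sized neighbours
    of `mid` that lie roughly on opposite sides of it."""
    centers = [center(b) for b in boxes]
    clusters = []
    for mid, box in enumerate(boxes):
        near = []
        for j, other in enumerate(boxes):
            if j == mid:
                continue
            big = max(size(box), size(other))
            small = min(size(box), size(other))
            if small == 0 or big / small > size_ratio:
                continue  # sizes too different
            d = math.dist(centers[mid], centers[j])
            if 0 < d <= dist_factor * big:
                near.append(j)
        for a, b in combinations(near, 2):
            ax, ay = centers[a][0] - centers[mid][0], centers[a][1] - centers[mid][1]
            bx, by = centers[b][0] - centers[mid][0], centers[b][1] - centers[mid][1]
            cos = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
            angle = math.degrees(math.acos(max(-1.0, min(1.0, cos))))
            if angle >= min_angle_deg:  # nearly a straight line through mid
                clusters.append((a, mid, b))
    return clusters

def merge_clusters(clusters):
    """Merge clusters that share a component (connected components via union-find)."""
    parent = {}

    def find(i):
        parent.setdefault(i, i)
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, mid, b in clusters:
        parent[find(a)] = find(mid)
        parent[find(b)] = find(mid)
    groups = {}
    for i in list(parent):
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical input: a horizontal line, a vertical line, and an isolated component.
boxes = [
    (0, 0, 10, 10), (12, 0, 10, 10), (24, 0, 10, 10), (36, 0, 10, 10),
    (200, 100, 10, 10), (200, 112, 10, 10), (200, 124, 10, 10),
    (500, 500, 10, 10),
]
clusters = three_char_clusters(boxes)
groups = merge_clusters(clusters)
print(groups)
```

In this sketch, the angle test stands in for the positional information mentioned in the abstract, and the union-find merge stands in for the graph-based merging; a curved line would still be followed as long as each consecutive triple stays above `min_angle_deg`.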
