Using Orientation Voting to Extract Text Lines with Various Mixed Directions from a Document Image

Xiaohua Zhang, Wenbo Jiang, Ning Xie, Xia Liu

doi:10.12792/jiiae.5.118

Abstract

Text line extraction from a document image is an important task for optical character recognition, document analysis, and related applications. This paper presents a novel approach to extracting text lines from a printed or handwritten document image. The document image is first binarized; connected components are then detected, and character components are collected from them. To cluster character components into conceptual text lines, a minimum spanning tree (MST) is built over the components using graph theory. An orientation voting strategy is proposed to compute the conceptual consistency of the MST links. Cutting the links with fewer votes yields an initial clustering of character components. Polynomials are used to model straight or curved text lines, and polynomials lying on the same line are merged to represent a single conceptual text line. Finally, a post-processing step deletes non-textual components. Experimental results demonstrate that the proposed algorithm produces plausible text lines.
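
The MST-and-voting step can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so everything below is an assumption made for illustration: character components are reduced to centroid points, the MST is built with Prim's algorithm on Euclidean distances, and each link collects one vote from every adjacent link whose orientation differs by at most a tolerance. The function names (build_mst, vote_and_cut, group_components) and the parameters tol and min_votes are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def build_mst(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) links."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known distance from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    links = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            links.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return links


def orientation(p, q):
    """Undirected link angle in [0, pi)."""
    return math.atan2(q[1] - p[1], q[0] - p[0]) % math.pi


def angle_diff(a, b):
    """Smallest difference between two undirected angles."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def vote_and_cut(points, links, tol=math.radians(20), min_votes=1):
    """Assumed voting rule: adjacent links with a similar orientation vote
    for each other; links with fewer than min_votes votes are cut."""
    incident = defaultdict(list)
    for k, (i, j) in enumerate(links):
        incident[i].append(k)
        incident[j].append(k)
    angles = [orientation(points[i], points[j]) for i, j in links]
    kept = []
    for k, (i, j) in enumerate(links):
        neighbours = {m for node in (i, j) for m in incident[node] if m != k}
        votes = sum(1 for m in neighbours if angle_diff(angles[k], angles[m]) <= tol)
        if votes >= min_votes:
            kept.append((i, j))
    return kept


def group_components(n, links):
    """Union-find over the surviving links; returns sorted clusters of indices."""
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for i, j in links:
        root[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(sorted(g) for g in groups.values())


# Two horizontal rows of centroids; the MST joins them with one vertical link.
centroids = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 5), (1, 5), (2, 5), (3, 5)]
mst = build_mst(centroids)
initial_lines = group_components(len(centroids), vote_and_cut(centroids, mst))
```

In this toy input the single vertical link between the two rows receives no votes from its horizontal neighbours and is cut, leaving one cluster per row. A link with no adjacent links at all also receives no votes under this simplified rule, so isolated pairs of components would need separate handling.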

More From: Journal of the Institute of Industrial Applications Engineers

Similar Papers

Segmentation of Handwritten Document Images into Text Lines
Vassilis Katsouros ... Vassilis Papavassiliou
19 Apr 2011

Neural Networks for Document Image and Text Processing
Joan Pastor Pellicer
03 Nov 2017

Towards Document Image Quality Assessment: A Text Line Based Framework and a Synthetic Text Line Image Dataset
Hongyu Li ... Junhua Qiu
01 Sep 2019

Handwritten Chinese text line segmentation by clustering with distance metric learning
Fei Yin ... Cheng-Lin Liu
Pattern Recognition | VOL. 42
04 Jan 2009
