Text region extraction in a document image based on the Delaunay tessellation

Yi Xiao, Hong Yan

doi:10.1016/s0031-3203(02)00082-1

Journal: Pattern Recognition
Publication Date: Jun 4, 2002
Citations: 58

Affiliation: University of Sydney, University of Hong Kong

#Delaunay Triangulation #Text Region Extraction

Abstract

In this paper, Delaunay triangulation is applied to the extraction of text areas in a document image. By representing the location of each connected component in a document image by its centroid, the page structure is described as a set of points in two-dimensional space. When Delaunay triangulation is imposed on these points, the triangles in text regions have features that distinguish them from those in image and drawing regions. For analysis, the Delaunay triangles are divided into four classes. The study reveals that specific triangles in text areas can be clustered together and identified as text body. Using this method, text regions in a document image containing fragments can also be recognized accurately. Experiments show the method is also very efficient.
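The first two steps of the abstract (reducing connected components to centroids, then triangulating those points) can be illustrated with a small Python sketch. This is not the authors' implementation: the function names `component_centroids`, `delaunay_triangles` and `longest_side` are hypothetical, the triangulation uses a brute-force empty-circumcircle test that is only practical for a handful of points, and it assumes no four points lie on a common circle. The abstract does not define the four triangle classes, so the longest side is computed only as one example of a per-triangle feature, not as the paper's classification rule.

```python
from collections import deque
from itertools import combinations
import math


def component_centroids(grid):
    """Return the centroid (x, y) of each 4-connected component of 1-pixels."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                # Breadth-first flood fill collects the pixels of one component.
                queue, pixels = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((x, y))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                cx = sum(p[0] for p in pixels) / len(pixels)
                cy = sum(p[1] for p in pixels) / len(pixels)
                centroids.append((cx, cy))
    return centroids


def _orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of CCW triangle abc."""
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 1e-12


def delaunay_triangles(points):
    """Brute force: keep every triangle whose circumcircle holds no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        o = _orient(a, b, c)
        if abs(o) < 1e-12:
            continue  # skip collinear triples
        if o < 0:
            b, c = c, b  # reorder to counter-clockwise for the determinant test
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(_in_circumcircle(a, b, c, d) for d in others):
            triangles.append((i, j, k))
    return triangles


def longest_side(points, tri):
    """Illustrative per-triangle feature (an assumption, not the paper's rule)."""
    return max(math.dist(points[u], points[v]) for u, v in combinations(tri, 2))
```

A real page would have thousands of components, where an O(n log n) triangulation routine from a computational geometry library would replace the brute-force loop; the sketch only makes the centroid-to-triangle pipeline concrete.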
