Stochastic language models for style-directed layout analysis of document images

T Kanungo, Song Mao

doi:10.1109/tip.2003.811487

Open Access

https://doi.org/10.1109/tip.2003.811487

Journal: IEEE Transactions on Image Processing
Publication Date: May 1, 2003
Citations: 44

Affiliation: IBM Research - Almaden, United States National Library of Medicine

Keywords: Stochastic Regular Grammar, Algorithm's Objective Function

Abstract

Image segmentation is an important component of any document image analysis system. While many segmentation algorithms exist in the literature, very few (i) allow users to specify the physical style and (ii) incorporate user-specified style information into the objective function that the algorithm minimizes. We describe a segmentation algorithm that models a document's physical structure as a hierarchy in which each node describes a region of the document using a stochastic regular grammar. The exact form of the hierarchy and of the stochastic language is specified by the user, while the probabilities associated with the transitions are estimated from ground-truth data. We demonstrate the segmentation algorithm on images of bilingual dictionaries.
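To make the idea of estimating transition probabilities from ground-truth data more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single node of the hierarchy can be approximated by a first-order chain over region labels, estimates the transition probabilities by counting with add-one smoothing, and scores a candidate labeling by its negative log-likelihood as one plausible form of an objective to minimize. The label names (`headword`, `pos`, `translation`), the function names, and the smoothing choice are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def estimate_transitions(sequences, smoothing=1.0):
    """Estimate label-to-label transition probabilities by counting.

    `sequences` is a list of ground-truth label sequences (hypothetical data).
    Additive smoothing keeps unseen transitions from getting probability zero.
    """
    states = sorted({label for seq in sequences for label in seq})
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev in states:
        total = sum(counts[prev].values()) + smoothing * len(states)
        probs[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in states
        }
    return probs


def negative_log_likelihood(seq, probs):
    """Score a label sequence; lower means a better fit to the learned style."""
    return -sum(math.log(probs[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))


# Hypothetical ground truth: label sequences for entries in a dictionary column.
ground_truth = [
    ["headword", "pos", "translation", "headword", "pos", "translation"],
    ["headword", "translation", "headword", "pos", "translation"],
]
probs = estimate_transitions(ground_truth)

# Two candidate labelings of the same three regions.
typical = ["headword", "pos", "translation"]
atypical = ["translation", "pos", "headword"]

print(negative_log_likelihood(typical, probs))
print(negative_log_likelihood(atypical, probs))
```

Under these assumptions, the labeling that follows the order seen in the ground truth receives the lower score, so a search over candidate segmentations would prefer it. The hierarchical structure and the user-specified grammar described in the abstract are not modeled here.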
