Understanding Digital Documents Using Gestalt Properties of Isothetic Components

Shyamosree Pal, Partha Bhowmick, Arindam Biswas, Bhargab B Bhattacharya

doi:10.4018/jdls.2010070101

Abstract

This paper shows how Gestalt properties can be used to identify the various components of a document image. The observation that the human mind takes a holistic rather than a piecemeal approach to vision is shown to be useful for document analysis. Since the major constituent components of a document page, textual or non-textual, are arranged in a rectilinear fashion, the components on the page are decomposed rectilinearly (isothetically). The page is first represented as a feature set of polygonal covers corresponding to its distinct regions of interest. Each polygon is then iteratively decomposed into sub-polygons that tightly enclose the corresponding sub-components, capturing both the overall information and the necessary details to the desired level of precision. Subsequently, these components and sub-components are analyzed using Gestalt laws/properties, which are explained in detail in the context of this work. Text regions, tabular structures, and various graphic objects readily admit some of the Gestalt properties. We have tested our algorithm on several benchmark datasets, and present some relevant results to demonstrate the effectiveness and elegance of the proposed method.
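
The coarse-to-fine decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: as a stand-in for a tight isothetic polygonal cover, it takes the union of grid cells that contain foreground pixels, and it treats 4-connected groups of such cells as components. The function names (`occupied_cells`, `components`, `decompose`), the grid sizes, and the toy page are all assumptions made for illustration.

```python
from collections import deque

def occupied_cells(image, g):
    """Return the set of (row, col) grid cells of size g x g that
    contain at least one foreground (non-zero) pixel."""
    cells = set()
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value:
                cells.add((y // g, x // g))
    return cells

def components(cells):
    """Group grid cells into 4-connected components. The union of the
    cells in one group is an axis-parallel (isothetic) region."""
    remaining = set(cells)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    group.add(nb)
                    queue.append(nb)
        groups.append(group)
    return groups

def decompose(image, grid_sizes):
    """Count the isothetic components found at each grid size,
    from coarse to fine (hypothetical helper for this sketch)."""
    return {g: len(components(occupied_cells(image, g))) for g in grid_sizes}

# Toy 8 x 8 "page": two short strokes on the top row, separated by a gap.
page = [[0] * 8 for _ in range(8)]
for x in (0, 1, 4, 5):
    page[0][x] = 1

print(decompose(page, [4, 2, 1]))  # {4: 1, 2: 2, 1: 2}
```

At the coarsest grid the two strokes fall into adjacent cells and merge into one region, which loosely mirrors grouping by proximity; at finer grids they separate into two sub-components. A faithful implementation would trace the actual polygon boundary and apply the Gestalt criteria defined in the paper.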

Journal: International Journal of Digital Library Systems
Publication Date: Jan 1, 2010
Citations: 1

Similar Papers

Understanding Digital Documents Using Gestalt Properties of Isothetic Components
Shyamosree Pal ... Partha Bhowmick
01 Jan 2012

DeepErase: Weakly Supervised Ink Artifact Removal in Document Text Images
Yike Qi ... W Ronny Huang
01 Mar 2020

Scene video text tracking based on hybrid deep text detection and layout constraint
Xihan Wang ... Zhaoqiang Xia
Neurocomputing | VOL. 363
22 Jul 2019

Deep Neural Networks Combined with STN for Multi-Oriented Text Detection and Recognition
Saif Hassan Katper ... Abdul Rehman
International Journal of Advanced Computer Science and Applications | VOL. 11
01 Jan 2020
