Layered ground truth: Conveying structural and statistical information for document image analysis and evaluation

Melissa Cote, Alexandra Branzan Albu

doi:10.1109/icpr.2016.7900137

Abstract

This paper addresses the problem of semantic overlap across document objects in the context of ground truth representation for document layout analysis. Document object categories often share low-level primitives (e.g., regions inside the bars of a bar chart resemble background). Because most datasets and ground truth models describe document objects rather than pixels, this overlap makes it difficult to evaluate document layout segmentation methods that are based on pixel classification. We propose a novel ground truth model that draws on both structural and statistical pattern recognition concepts: statistical pixel-based data derived from low-level elemental patterns are layered onto high-level structural object-based data. We also present evaluation metrics that take advantage of the layered ground truth model, allowing a contextual evaluation of pixel classification algorithms. We apply the proposed model to two recent pixel classification approaches, evaluated on business document images that exhibit a challenging mixture of textual, graphical, and pictorial elements in varied layouts. The proposed model yields very detailed, comprehensive, and intuitive information on the strengths and limitations of the evaluated approaches that would be impossible to obtain through other models.
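
The layering idea can be illustrated with a minimal sketch. It is not the authors' model or their metrics, which the abstract does not specify. It assumes that each pixel carries two ground truth labels, a high-level object label and a low-level elemental pattern label, and that a "contextual" score is simply pixel accuracy grouped by the enclosing object. The label names, the tiny 2x4 image, and the function `accuracy_by_object` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical labels on a tiny 2x4 image, for illustration only.
# Object layer: the high-level document object each pixel belongs to.
object_layer = [
    ["text", "text", "chart", "chart"],
    ["text", "text", "chart", "chart"],
]
# Pixel layer: the low-level elemental pattern each pixel shows.
# Note that "background" occurs inside both a text region and a chart.
pixel_layer = [
    ["text", "background", "graphic", "background"],
    ["text", "text", "graphic", "background"],
]
# Output of some pixel classifier under evaluation (assumed).
predicted = [
    ["text", "background", "graphic", "graphic"],
    ["text", "text", "graphic", "background"],
]


def accuracy_by_object(object_layer, pixel_layer, predicted):
    """Return pixel-level accuracy grouped by the enclosing object category."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for obj_row, px_row, pred_row in zip(object_layer, pixel_layer, predicted):
        for obj, truth, pred in zip(obj_row, px_row, pred_row):
            total[obj] += 1
            correct[obj] += int(truth == pred)
    return {obj: correct[obj] / total[obj] for obj in total}


scores = accuracy_by_object(object_layer, pixel_layer, predicted)
print(scores)
```

In this toy case the classifier is perfect inside the text region but mislabels one background pixel inside the chart, so the per-object breakdown exposes an error that a single global accuracy figure would blur.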
