A Survey of Graphical Page Object Detection with Deep Neural Networks

Jwalin Bhatt, Muhammad Zeshan Afzal, Didier Stricker, Khurram Azeem Hashmi

doi:10.3390/app11125344


Open Access

https://doi.org/10.3390/app11125344


Abstract

In any document, graphical elements such as tables, figures, and formulas contain essential information. Processing and interpreting this information requires specialized algorithms, because off-the-shelf OCR components cannot handle it reliably. An essential step in document analysis pipelines is therefore to detect these graphical components. Detection leads to a high-level conceptual understanding of documents, which makes their digitization viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. This work outlines and summarizes the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection, providing a comprehensive understanding of current methods and related challenges. Furthermore, we discuss leading datasets along with their quantitative evaluation. Finally, we briefly discuss promising directions for further improvement.

Highlights

We have presented a thorough analysis of the recent state-of-the-art approaches to the problem of graphical page object detection in document images.
We provide an evaluative comparison among the state-of-the-art graphical page object detection systems.
By leveraging the segmentation loss of Mask R-CNN, researchers in the document image analysis community have improved the performance of graphical page object detection systems.

Summary

Introduction

It is evident that even the state-of-the-art OCR method [6] fails to extract precise information from figures, tables, and formulas. Another application of such page object detection methods is document retrieval systems [7,8], where a document image containing a specific type of page object is required. Although the approaches leveraging these datasets have significantly improved the state of the art, a consolidated comparison among these approaches is missing. In this survey paper, we have presented a thorough analysis of the recent state-of-the-art approaches to the problem of graphical page object detection in document images.

Discussion and Conclusion

Traditional Approaches

Methodologies

Method

Faster R-CNN

Mask R-CNN

Deformable Convolutions

Dynamic Programming Based Approach

Fully Convolutional Neural Networks

Datasets

ICDAR-17 POD

PubLayNet

DocBank

Marmot

TableBank

IIIT-AR-13k

DeepFigures

ICDAR-13

ICDAR-2019

Evaluation

Precision

Intersection Over Union

Evaluation for Table Detection

Evaluation for Figure Detection

Evaluations for Formula Detection
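
The evaluation headings above name precision and Intersection over Union (IoU) as the measures used to compare detectors. As a minimal sketch of how the two are commonly combined, and not the survey's own evaluation code, the following assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` and a greedy one-to-one matching of predictions to ground truth. The function names `iou` and `precision_at_iou` and the default threshold of 0.5 are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_at_iou(predictions: List[Box],
                     ground_truth: List[Box],
                     threshold: float = 0.5) -> float:
    """Fraction of predictions that match an unused ground-truth box.

    A prediction counts as a true positive when its best-overlapping,
    not-yet-matched ground-truth box has IoU >= threshold (greedy matching,
    an assumption made for this sketch).
    """
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_idx, best_score = -1, 0.0
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score > best_score:
                best_idx, best_score = idx, score
        if best_idx >= 0 and best_score >= threshold:
            matched.add(best_idx)
            true_positives += 1
    return true_positives / len(predictions) if predictions else 0.0
```

Raising `threshold` makes the measure stricter about localization: a detected table or figure must overlap the annotated region more tightly before it counts as correct.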

Discussion and Conclusions

Difficulties and Challenges

Future Work


Journal: Applied Sciences
Publication Date: Jun 9, 2021
Citations: 23
License type: CC BY 4.0
