Abstract
In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. It leads to a high-level conceptual understanding of the documents that make the digitization of documents viable. Since the advent of deep learning, deep learning-based object detection performance has improved many folds. This work outlines and summarizes the deep learning approaches for detecting graphical page objects in document images. Therefore, we discuss the most relevant deep learning-based approaches and state-of-the-art graphical page object detection in document images. This work provides a comprehensive understanding of the current state-of-the-art and related challenges. Furthermore, we discuss leading datasets along with the quantitative evaluation. Moreover, it discusses briefly the promising directions that can be utilized for further improvements.
Highlights
We have presented a thorough analysis of the recent state-of-theart approaches that have approached the problem of graphical page object detection in
We provide an evaluative comparison among the state-of-the-art graphical page object detection systems
By leveraging the segmentation loss of Mask R-Convolutional Neural Networks (CNN), researchers in the document image analysis community have improved the performance of graphical page object detection systems
Summary
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. It is evident that even the state-of-the-art OCR method [6] fails to extract precise information from figures, tables, and formulas Another application of such page object detection methods is document retrieval systems [7,8], where a document image having a specific type of page object is required. The approaches leveraging these datasets have significantly improved state-of-the-art, a consolidated comparison among these approaches is missing In this survey paper, we have presented a thorough analysis of the recent state-of-theart approaches that have approached the problem of graphical page object detection in.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.