Automatic Table Detection, Structure Recognition and Data Extraction from Document Images

Borra Vineetha, D N D Harini, Ravi Yelesvarupu

doi:10.35940/ijitee.i9349.0710921

Open Access

https://doi.org/10.35940/ijitee.i9349.0710921

Abstract

With the growing use of electronic devices to photograph and upload documents, the need to extract information from unstructured document images is becoming increasingly pressing. A major obstacle is that these images often contain information in tabular form, and extracting data from table images presents a series of challenges because of the varied layouts and encodings of tables. The task involves accurately detecting the table in an image, then recognizing its internal structure and extracting the information it contains. Although some progress has been made in table detection, obtaining the table contents remains a challenge because it requires more fine-grained recognition of table structure (rows and columns). Because there are millions of documents, the digitization of critical information has to be carried out automatically. Motivated by the way AI-based solutions are automating many processes, this work comprises three stages: first, table detection using the Faster R-CNN algorithm; second, recognition of the table's internal structure using morphological operations and a refinement operation; and last, table data extraction using a contour algorithm. The dataset used in this work is the UNLV dataset.
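The second and third stages named in the abstract (morphological operations to find ruling lines, then a contour-style step to locate cells) can be illustrated with a small sketch. This is not the authors' implementation: it is a standard-library stand-in that assumes a binarized, fully ruled table, and the names `keep_long_runs`, `find_cells`, `demo_table` and the parameter `k` are hypothetical. An image-processing library such as OpenCV would normally do the same job with a morphological opening followed by contour detection.

```python
from collections import deque

def keep_long_runs(img, k, horizontal=True):
    """Morphological opening with a 1 x k (or k x 1) line element:
    keep only foreground pixels that lie in a straight run of >= k pixels."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    outer, inner = (h, w) if horizontal else (w, h)
    for a in range(outer):
        start = None
        for b in range(inner + 1):  # one extra step closes a run at the edge
            on = b < inner and (img[a][b] if horizontal else img[b][a])
            if on and start is None:
                start = b
            elif not on and start is not None:
                if b - start >= k:  # long enough to be a ruling line
                    for c in range(start, b):
                        if horizontal:
                            out[a][c] = 1
                        else:
                            out[c][a] = 1
                start = None
    return out

def find_cells(img, k):
    """Return (top, left, bottom, right) boxes of regions enclosed by lines."""
    h, w = len(img), len(img[0])
    hor = keep_long_runs(img, k, True)
    ver = keep_long_runs(img, k, False)
    # Text strokes are shorter than k, so only the ruling lines remain here.
    lines = [[hor[y][x] | ver[y][x] for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    cells = []
    for y0 in range(h):
        for x0 in range(w):
            if lines[y0][x0] or seen[y0][x0]:
                continue
            # Flood fill one connected non-line region (4-connectivity).
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            top, left, bottom, right = y0, x0, y0, x0
            touches_edge = False
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_edge = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not lines[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_edge:  # regions reaching the image edge lie outside the table
                cells.append((top, left, bottom, right))
    return sorted(cells)

def demo_table():
    """A 9 x 13 binary image: a 2 x 2 ruled table plus two 'text' pixels."""
    img = [[1 if y % 4 == 0 or x % 6 == 0 else 0 for x in range(13)]
           for y in range(9)]
    img[2][2] = img[2][3] = 1  # short stroke standing in for cell text
    return img

if __name__ == "__main__":
    for box in find_cells(demo_table(), k=5):
        print(box)
```

Each returned box is a cell region that could then be cropped and passed to a text recognizer. The sketch only handles tables whose cells are fully enclosed by ruling lines; tables without rulings, which the abstract's "various layouts" would include, need a different structure-recognition step.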

Highlights

This paper focuses on table detection, table internal structure recognition, and data extraction in scanned documents
To improve table detection performance and address the limitations of prior methods, this paper proposes a table detection method based on deep learning techniques
The proposed method consists of three major modules: Table Detection, Table Structure Recognition, and Table Data Extraction

Summary

Introduction

Tables are widely used in many domains to present and communicate structured information, since they enable readers to search, compare, and understand facts and to draw conclusions rapidly. Automatically detecting tables in documents and extracting the information they contain is of significant importance in the field of document recognition and analysis, and has attracted considerable research effort over the past few decades. This paper focuses on table detection, table internal structure recognition, and data extraction in scanned documents.

Revised Manuscript received on July 17, 2021.

Journal: International Journal of Innovative Technology and Exploring Engineering
Publication Date: Jul 30, 2021
Citations: 1
License type: cc-by

Similar Papers

TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images
Shubham Singh Paliwal ... Rohit Rahul
01 Sep 2019

DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images
Sebastian Schreiber ... Sheraz Ahmed
01 Nov 2017

CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents
Devashish Prasad ... Kshitij Kapadni
01 May 2020

Robust Table Detection and Structure Recognition from Heterogeneous Document Images
Chixiang Ma ... Qiang Huo
Pattern Recognition | VOL. 133
29 Aug 2022
