Abstract

Demand for automated document processing techniques is growing as the volume of electronic component documents increases. The need is most acute in the supply chain optimization sector, where processing vast numbers of documents is time-consuming and error-prone. Detecting tables and table structures is a crucial step in automating document processing. While table detection is a well-investigated problem, tabular structure detection is more complex and requires further improvement. To address this, this study proposes a deep learning model, TabCellNet, that focuses on high-precision detection of tabular cell structure. We establish a benchmark for cell structure detection on the ICDAR2013 dataset, compare the proposed model against previous state-of-the-art table detection models, and propose alternative models. Our methodology improves table structure detection by detecting cells instead of rows and columns, which generalizes better to heterogeneous table structures. The proposed model advances prior models by improving major parts of the detection pipeline, mainly the two-stage detector, backbone, backbone architecture, and non-maximum suppression (NMS). TabCellNet combines Hybrid Task Cascade (HTC) with Combinational Backbone Network (CBNet), dual ResNeXt101, and Soft-NMS to achieve a precision of 89.2% and a recall of 98.7% on the hand-annotated ICDAR2013 cell structure dataset.
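To illustrate the Soft-NMS step named above, the following is a minimal sketch of the Gaussian variant of Soft-NMS in plain Python. It is not the authors' implementation: the function names `iou` and `soft_nms`, the `(x1, y1, x2, y2)` box format, and the `sigma` and `score_threshold` defaults are assumptions chosen for illustration. Instead of discarding every box that overlaps a higher-scoring detection, as hard NMS does, it decays the overlapping box's score, which can help when neighbouring table cells sit close together.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative sketch; defaults are assumptions).

    Returns a list of (box, score) pairs in the order they were selected.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the remaining candidate with the highest current score.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))

        # Decay the scores of the remaining boxes by their overlap with the
        # selected box, and drop any whose score falls below the threshold.
        rescored = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                rescored.append((box, new_score))
        candidates = rescored
    return kept


# Hypothetical cell detections: two heavily overlapping boxes and one isolated box.
example_boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
example_kept = soft_nms(example_boxes, example_scores)
```

In this example the overlapping second box is retained with a reduced score rather than removed, so a downstream confidence threshold decides whether it survives.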
