Abstract

Tables are commonly used for effective and compact representation of relational information in diverse document classes such as scientific papers, financial statements, newspaper articles, invoices, and product descriptions. Although table structure detection is relatively simple for humans, recognizing precise table structure remains a computer vision challenge. Further, the innumerable possible table layouts complicate automatic topic modeling and the understanding of each table within a generic document. This paper develops a framework to recognize table structure from Compound Document Images (CDIs). Initially, a bilateral filter is applied for image transformation, enhancing CDI quality. An improved binarization-Sauvola algorithm (IBSA) is proposed to binarize degraded tables with uneven illumination, low contrast, and uniform background. A morphological thinning method extracts the lines from the table, and a masking approach extracts the rows and columns. Finally, a ResNet Attention model optimized with Black Widow optimization-based mutual exclusion (BWME) is developed to recognize the table structure from the document images. The UNLV, TableBank, and ICDAR-2013 table competition datasets are used to evaluate the proposed framework, with precision and accuracy as the evaluation metrics. In the experiments, the proposed framework achieved a precision of 96.62 and an accuracy of 94.34, which demonstrates its effectiveness.
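
To make the binarization and row/column extraction steps more concrete, the following is a minimal sketch and not the paper's method. It uses the standard Sauvola threshold, T = m * (1 + k * (s / R - 1)), rather than the proposed IBSA, and a simple ink-fraction projection stands in for the morphological thinning and masking stages. The function names, the window size, and the values of k, R, and min_fraction are all illustrative assumptions.

```python
from math import sqrt


def sauvola_binarize(image, window=3, k=0.2, R=128.0):
    """Binarize a grayscale image (list of rows) with standard Sauvola.

    Returns 1 for ink (dark) pixels and 0 for background.
    window, k, and R are assumed values, not taken from the paper.
    """
    h, w = len(image), len(image[0])
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Local neighbourhood, clipped at the image border.
            vals = [image[j][i]
                    for j in range(max(0, y - half), min(h, y + half + 1))
                    for i in range(max(0, x - half), min(w, x + half + 1))]
            mean = sum(vals) / len(vals)
            std = sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            threshold = mean * (1 + k * (std / R - 1))
            out[y][x] = 1 if image[y][x] < threshold else 0
    return out


def ruling_rows(binary, min_fraction=0.8):
    """Indices of rows that are mostly ink: candidate horizontal rulings."""
    return [y for y, row in enumerate(binary)
            if sum(row) / len(row) >= min_fraction]


def ruling_cols(binary, min_fraction=0.8):
    """Indices of columns that are mostly ink: candidate vertical rulings."""
    h = len(binary)
    return [x for x in range(len(binary[0]))
            if sum(binary[y][x] for y in range(h)) / h >= min_fraction]


# Toy 7x7 "table": light background with one horizontal and one vertical ruling.
image = [[200] * 7 for _ in range(7)]
for i in range(7):
    image[3][i] = 20  # horizontal ruling on row 3
    image[i][2] = 20  # vertical ruling on column 2

binary = sauvola_binarize(image)
print(ruling_rows(binary), ruling_cols(binary))
```

On real scans, a library implementation of adaptive thresholding would be far faster than this pure-Python loop; the sketch is only meant to show how a local threshold separates ruling lines from the background before rows and columns are located.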
