Abstract

Tabular structure detection and recognition can be a valuable step in the analysis of unstructured documents. The noisy handwritten documents we analyze may contain pre-printed rulings as the substrate, hand-drawn rulings, machine-printed text, handwritten text, and signatures, in addition to the tabular structures that we wish to decompose into basic cells, rows, and columns. Although prior work has addressed machine-printed documents, noisy handwritten documents may require modified or new techniques. In this work, we aim to detect tabular structures and decompose them into 2-D grids of table cells simultaneously. First, we detect key points that help determine the physical and logical structure of tables. Then, we use the 2-D grid assumption to build grids of key points. Finally, we extract structural features for the Min-Cut/Max-Flow algorithm to recognize tabular structures. Experiments on 22 tables containing 584 table cells show a cell precision of 100% and a cell recall of 93.3%.
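
To make the second step more concrete, the following is a minimal sketch of how detected key points might be snapped onto a 2-D grid and turned into candidate cells. It is an illustration under stated assumptions, not the method used in this work: the key points are assumed to be (x, y) pixel coordinates of ruling intersections, the table is assumed to be roughly axis-aligned, and the names `cluster_1d`, `build_grid`, `grid_cells`, and the tolerance `tol` are hypothetical. The Min-Cut/Max-Flow recognition step is not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Cell = Tuple[float, float, float, float]  # (left, top, right, bottom)


def cluster_1d(values: List[float], tol: float) -> List[float]:
    """Group sorted values whose gap to the previous value is <= tol.

    Returns the mean of each group, i.e. one coordinate per grid line.
    """
    centers: List[float] = []
    group: List[float] = []
    for v in sorted(values):
        # A gap larger than tol starts a new grid line.
        if group and v - group[-1] > tol:
            centers.append(sum(group) / len(group))
            group = []
        group.append(v)
    if group:
        centers.append(sum(group) / len(group))
    return centers


def build_grid(points: List[Point], tol: float = 5.0) -> Tuple[List[float], List[float]]:
    """Estimate column (x) and row (y) grid lines from noisy key points.

    Assumption: the table is close to axis-aligned, so x and y can be
    clustered independently.
    """
    xs = cluster_1d([p[0] for p in points], tol)
    ys = cluster_1d([p[1] for p in points], tol)
    return xs, ys


def grid_cells(xs: List[float], ys: List[float]) -> List[Cell]:
    """Enumerate candidate cells between adjacent grid lines, row by row."""
    cells: List[Cell] = []
    for r in range(len(ys) - 1):
        for c in range(len(xs) - 1):
            cells.append((xs[c], ys[r], xs[c + 1], ys[r + 1]))
    return cells
```

With nine slightly jittered key points arranged three by three, `build_grid` yields three column lines and three row lines, and `grid_cells` yields four candidate cells. A real handwritten document would also need handling for skew, missing intersections, and merged cells, which this sketch ignores.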
