Abstract

Over the course of history, a precious cultural heritage has been left behind to tell ancient stories, much of it in the form of written documents. In this paper, we propose a weakly supervised segmentation system that uses recognition-guided information about the attention area to achieve high-precision historical document segmentation under strict intersection-over-union (IoU) requirements. We formulate the character segmentation problem from a Bayesian decision theory perspective and progressively propose boundary box segmentation (BBS), recognition-guided BBS (Rg-BBS), and recognition-guided attention BBS (Rg-ABBS) to search for the segmentation path. Furthermore, a novel judgment gate mechanism is proposed to train a high-performance character recognizer in an incremental, weakly supervised manner. By incorporating both character recognition knowledge and line-level annotation, the proposed Rg-ABBS method substantially reduces time consumption while maintaining sufficiently high segmentation precision. Experiments show that the proposed Rg-ABBS system significantly outperforms traditional segmentation methods, as well as deep-learning-based instance segmentation and detection methods, under strict IoU requirements.
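
For readers unfamiliar with the IoU criterion mentioned above, the following is a minimal sketch of how IoU between a predicted and a ground-truth character box can be computed and thresholded. The box format `(x1, y1, x2, y2)`, the helper names `iou` and `is_match`, and the threshold value `0.9` are illustrative assumptions; the abstract does not state the exact threshold or box representation used in the paper.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    The box format is an assumption made for this illustration.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union = sum of both areas minus the doubly counted overlap.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.9):
    """True if the predicted box meets the IoU threshold.

    threshold=0.9 is a hypothetical "strict" value, not taken from the paper.
    """
    return iou(pred, truth) >= threshold
```

Under this sketch, a predicted box that misses 2 pixels of a 10-pixel-wide character has an IoU of 0.8, so it would be rejected at a threshold of 0.9 even though it would pass a looser threshold such as 0.5. This is why small boundary errors matter under strict IoU requirements.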

