Abstract

Recent advances in Handwritten Text Recognition and Document Layout Analysis have made it possible to convert digital images of manuscripts into electronic text. However, giving this text the correct structure and context remains an open problem, and it must be solved before the relevant information conveyed by the text can be extracted. The most important structure needed for a set of text elements is their reading order. Most studies of the reading order problem take rule-based approaches and focus on printed documents. Much less attention has been paid so far to handwritten text documents, where the problem becomes particularly important and challenging. In this work, we propose a new approach to automatically determine the reading order of text regions and lines in handwritten text documents. The task is approached as a sorting problem where the order-relation operator is automatically learned from examples. We experimentally demonstrate the effectiveness of our method on three different datasets at different hierarchical levels.
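
To make the sorting formulation concrete, the following is a minimal sketch, not the method evaluated in this work. It assumes each text region is reduced to a normalised centre point, and it uses a hand-written placeholder (`placeholder_precedes`) where a classifier learned from examples would go. The names `Region`, `placeholder_precedes` and `reading_order` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Region:
    """A text region or line, reduced to a hypothetical centre-point feature."""
    name: str
    x: float  # horizontal centre, normalised to [0, 1]
    y: float  # vertical centre, normalised to [0, 1]


def placeholder_precedes(a: Region, b: Region) -> float:
    """Stand-in for a learned order relation: an estimate of P(a is read before b).

    ASSUMPTION: a binary classifier trained on annotated pages would be used
    here. This fixed rule (top-to-bottom, then left-to-right) exists only so
    that the sketch runs.
    """
    if abs(a.y - b.y) > 0.05:  # clearly on different rows
        return 1.0 if a.y < b.y else 0.0
    if a.x == b.x:             # indistinguishable under this rule
        return 0.5
    return 1.0 if a.x < b.x else 0.0


def reading_order(
    regions: List[Region],
    precedes: Callable[[Region, Region], float],
) -> List[Region]:
    """Sort regions by their total pairwise 'read before' score.

    Summing the scores against every other region still yields a total order
    when the pairwise relation is not perfectly transitive, which a learned
    relation is not guaranteed to be.
    """
    def score(r: Region) -> float:
        return sum(precedes(r, other) for other in regions if other is not r)

    return sorted(regions, key=score, reverse=True)


# Usage: four regions of a toy page, given in arbitrary order.
page = [
    Region("right_column", 0.75, 0.5),
    Region("footer", 0.5, 0.9),
    Region("title", 0.5, 0.1),
    Region("left_column", 0.25, 0.5),
]
ordered_names = [r.name for r in reading_order(page, placeholder_precedes)]
```

Ranking by summed pairwise scores costs a quadratic number of comparisons. A comparison sort would need fewer, but it relies on the relation being consistent, so the choice between the two is a design trade-off rather than something the abstract specifies.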
