Abstract
There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: International Journal of Document Analysis and Recognition (IJDAR)
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.