Text Line Extraction in Historical Documents Using Mask R-CNN

Ahmad Droby, Boraq Madi, Jihad El-Sana, Berat Kurar Barakat, Reem Alaasam, Irina Rabaev

doi:10.3390/signals3030032

Open Access

https://doi.org/10.3390/signals3030032

Abstract

Text line extraction is an essential preprocessing step in many handwritten document image analysis tasks. It comprises detecting the text lines in a document image and segmenting the region of each detected line. Deep learning-based methods are frequently used for text line detection; however, only a limited number of methods tackle detection and segmentation together. This paper proposes a holistic method that applies Mask R-CNN to text line extraction. A Mask R-CNN model is trained to extract text line fragments from document patches, and these fragments are then merged to form the text lines of an entire page. The presented method was evaluated on two well-known datasets of historical documents, DIVA-HisDB and ICDAR 2015-HTR, and achieved state-of-the-art results. In addition, we introduce a new, challenging dataset of Arabic historical manuscripts, VML-AHTE, in which numerous diacritics are present. We show that the presented Mask R-CNN-based method can successfully segment text lines even in such a challenging scenario.
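
The abstract does not say how the per-patch outputs are merged into page-level lines, so the following is only a minimal sketch of one plausible approach, not the authors' procedure. It assumes that patches overlap and that two predicted fragments belong to the same text line when their masks share pixels in page coordinates. The helper names `to_page_pixels` and `merge_fragments` and the `min_overlap` parameter are hypothetical, and the masks are hand-written stand-ins for Mask R-CNN predictions.

```python
from itertools import combinations


def to_page_pixels(mask, top, left):
    """Map a patch-local binary mask (rows of 0/1) to a set of page coordinates."""
    return {(top + r, left + c)
            for r, row in enumerate(mask)
            for c, value in enumerate(row) if value}


def merge_fragments(fragments, min_overlap=1):
    """Group fragments that share at least `min_overlap` pixels (assumed rule)."""
    fragments = [f for f in fragments if f]  # ignore empty masks
    parent = list(range(len(fragments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(fragments)), 2):
        if len(fragments[i] & fragments[j]) >= min_overlap:
            parent[find(i)] = find(j)

    lines = {}
    for i, pixels in enumerate(fragments):
        lines.setdefault(find(i), set()).update(pixels)
    # Order the merged lines top to bottom by their smallest row index.
    return sorted(lines.values(), key=lambda px: min(r for r, _ in px))


# Toy input: two overlapping patches that see the same line, and one line below.
patch_a = to_page_pixels([[1, 1, 1, 1], [0, 0, 0, 0]], top=0, left=0)
patch_b = to_page_pixels([[1, 1, 1, 1], [0, 0, 0, 0]], top=0, left=2)
patch_c = to_page_pixels([[1, 1, 1, 1]], top=5, left=0)
page_lines = merge_fragments([patch_a, patch_b, patch_c])
```

In this toy case the fragments from `patch_a` and `patch_b` overlap and are joined into one line, while `patch_c` remains a separate line. A real pipeline would likely need a stricter criterion than raw pixel overlap, for example when diacritics from adjacent lines touch.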

Journal: Signals	Publication Date: Aug 4, 2022
Citations: 14	License type: CC BY 4.0

Similar Papers

Text Line Extraction Using Fully Convolutional Network and Energy Minimization
Berat Kurar Barakat ... Jihad El-Sana
01 Jan 2020

A Robust and Binarization-Free Approach for Text Line Detection in Historical Documents
Tobias Grüning ... Gundram Leifert
01 Nov 2017

Human Reading Knowledge Inspired Text Line Extraction
Liuan Wang ... Seiichi Uchida
Cognitive Computation | VOL. 10
02 Aug 2017

Combining Learned Script Points and Combinatorial Optimization for Text Line Extraction
Joan Pastor-Pellicer ... Angelika Garz
22 Aug 2015
