Understanding Unsupervised Deep Learning for Text Line Segmentation

Ahmad Droby, Berat Kurar Barakat, Reem Alaasam, Raid Saabni, Boraq Madi, Jihad El-Sana

doi:10.3390/app12199528

Open Access

https://doi.org/10.3390/app12199528

Abstract

We propose an unsupervised feature learning approach for segmenting text lines in handwritten document images with no labelling effort. Humans can easily group local text line features into coarse global patterns. We leverage this coherent visual perception of text lines as a supervising signal by formulating feature learning as a global pattern differentiation task. The machine is trained to detect that a document patch shares a similar global text line pattern with itself or its neighbours, and a different global text line pattern with the 90-degree-rotated version of itself or its neighbours. Clustering the central windows of document image patches by their extracted features forms blob lines that strike through the text lines. The blob lines guide an energy minimization function for extracting text lines in a binary image and a seam carving function for detecting baselines in a colour image. By identifying the aspects of the input patch that support the actual prediction and clustering, we contribute to the understanding of input patch functionality. We evaluate the method on several variants of text line segmentation datasets to demonstrate its effectiveness, visualize what it has learned, and make its clustering strategy comprehensible from a human perspective.
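The pair-based training signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `make_pair`, the patch size, the choice of the right-hand neighbour, the 50/50 sampling, and the label convention (0 for similar, 1 for different) are all assumptions made for illustration. It only shows how a patch can be paired with itself or a neighbour, and how rotating the partner by 90 degrees yields a "different pattern" example without any manual labels.

```python
import numpy as np

def make_pair(page, top, left, size, rng):
    """Return (patch_a, patch_b, label) for one training pair.

    Assumed label convention: 0 = similar global pattern, 1 = different.
    """
    anchor = page[top:top + size, left:left + size]
    # Partner is the patch itself or its right-hand neighbour (assumed choice),
    # falling back to the patch itself when the neighbour would leave the page.
    use_neighbour = rng.random() < 0.5 and left + 2 * size <= page.shape[1]
    partner_left = left + size if use_neighbour else left
    partner = page[top:top + size, partner_left:partner_left + size]
    # Roughly half of the pairs get a 90-degree-rotated partner, labelled "different".
    if rng.random() < 0.5:
        return anchor, np.rot90(partner), 1
    return anchor, partner, 0

# Toy "page": horizontal stripes standing in for text lines.
page = np.zeros((64, 128))
page[::8] = 1.0
rng = np.random.default_rng(0)
pairs = [make_pair(page, 0, 0, 32, rng) for _ in range(20)]
```

On the striped toy page, an unrotated partner looks like the anchor (horizontal stripes), while a rotated partner shows vertical stripes, which is the kind of global difference the model is asked to tell apart.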

Full Text