A Model for Text Line Segmentation and Classification in Printed Documents

Jun Guo, Xin Wang

doi:10.1145/3457682.3457760

Abstract

In this paper, we propose a new model for text line segmentation and classification, which consists of convolutional and two-layer bi-directional long short-term memory (BiLSTM) networks. Trained on a synthetic text dataset, it performs very well on real data. To evaluate accuracy without labelling every line of the real data, we propose a generalized evaluation standard. We also propose a simplified IoU loss that greatly improves execution speed. In the experiments, the model achieves 98.1% line segmentation accuracy and 99.5% classification accuracy on the English novel Pride and Prejudice by Jane Austen, and 98.5% line segmentation accuracy and 99.7% classification accuracy on The Secret Of Plato's Atlantis by John Arundell, outperforming the traditional methods. Furthermore, for 1024 × 724 input samples, it runs at 2.95 FPS on a Tesla K80 GPU.

Index Terms: Text line segmentation, Text classification, Synthetic text, BiLSTM, Convolutional network.

Full Text