A Methodological Study of Document Layout Analysis

Chunhu Zhang, Askar Hamdulla, Mayire Ibrayim

doi:10.1109/vrhciai57205.2022.00009

Abstract

Document layout analysis is an important part of document information processing systems. It is essential for applications such as optical character recognition (OCR), machine translation, information retrieval, and structured data extraction from documents, as well as for digitizing paper documents and for classifying and identifying regions of document images. Document images contain a wealth of information; to extract and classify regions of interest automatically, the layout of each image is analyzed before OCR and automatic transcription are applied. However, existing algorithms remain limited by the diversity of document layouts, variation in block positions, inter-class and intra-class variation, and background noise. This paper first summarizes traditional algorithms based on run-length smoothing and projection-based segmentation, deep learning algorithms that use recurrent convolutional neural networks and Siamese (twin) networks, and algorithms proposed in recent years that combine traditional and deep learning approaches. It then highlights the current mainstream deep learning algorithms, the datasets commonly used in experiments, and how those datasets can be obtained. A comparison of several algorithms on benchmark datasets is also given, together with experimental results that show good robustness. Finally, directions for future research are discussed.

Full Text