Abstract

In this paper, we address the problem of automatic skew angle estimation for scanned documents. Document skew occurs frequently, caused by incorrect positioning of the document or a handling error during scanning. It degrades the subsequent steps of automatic text analysis and recognition. Before those steps, it is therefore essential to check the document for skew and to correct it. The difficulty of this check stems from graphic zones, sometimes dominant, which considerably reduce the accuracy of text skew angle estimation. We also note the importance of preprocessing for improving both the accuracy and the computational cost of skew estimation approaches. We took these two elements into account in designing and developing a new approach to skew angle estimation and correction. Our approach is based on local binarization, followed by horizontal smoothing with the Run Length Smoothing Algorithm (RLSA), detection of horizontal contours, and the Hierarchical Hough Transform (HHT). The algorithms involved were chosen so that skew estimation is accurate, fast, and robust, particularly to graphic dominance, and suitable for real-time application. Experimental tests show the effectiveness of our approach on a representative database from the Document Image Skew Estimation Contest (DISEC) of the International Conference on Document Analysis and Recognition (ICDAR).
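The pipeline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the page has already been binarized (1 for ink, 0 for background) and stored as a list of rows. It takes the bottom edge of each smoothed blob as the "horizontal contour", and it replaces the Hierarchical Hough Transform with a simple two-pass, coarse-to-fine search over candidate angles. All function names, the scoring rule (sum of squared votes per offset bin), and the default parameters are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def rlsa_horizontal(binary, threshold):
    """Horizontal RLSA: fill background runs of length <= threshold
    that lie between two foreground pixels on the same row."""
    out = [list(row) for row in binary]
    for row in out:
        last_fg = None
        for x, value in enumerate(row):
            if value:
                if last_fg is not None and 0 < x - last_fg - 1 <= threshold:
                    for k in range(last_fg + 1, x):
                        row[k] = 1
                last_fg = x
    return out


def bottom_contour(binary):
    """Return (y, x) for foreground pixels with background (or the image
    border) directly below them. Assumed stand-in for contour detection."""
    height = len(binary)
    points = []
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if value and (y + 1 == height or not binary[y + 1][x]):
                points.append((y, x))
    return points


def _score(points, angle_deg):
    # Points on a line tilted by angle a share the offset y*cos(a) - x*sin(a).
    # Squaring the votes rewards angles that concentrate points in few bins.
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    votes = Counter(round(y * cos_a - x * sin_a) for y, x in points)
    return sum(v * v for v in votes.values())


def _frange(start, stop, step):
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def estimate_skew(points, max_angle=15.0, coarse_step=1.0, fine_step=0.1):
    """Coarse-to-fine angle search, in degrees. Positive means text lines
    move down the array (y grows) as x grows."""
    if not points:
        return 0.0
    coarse = _frange(-max_angle, max_angle, coarse_step)
    best = max(coarse, key=lambda t: _score(points, t))
    fine = _frange(best - coarse_step, best + coarse_step, fine_step)
    return max(fine, key=lambda t: _score(points, t))
```

A call such as `estimate_skew(bottom_contour(rlsa_horizontal(page, 5)))` chains the three steps, where `page` is a hypothetical binarized image and `5` is an arbitrary smoothing threshold. Voting on contour points only, rather than on every ink pixel, keeps the number of votes small, which is one plausible reason for placing a contour step before the Hough stage.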

