Abstract

Skew detection is one of the first operations applied to scanned documents when converting data to a digital format. Its aim is to align an image before further processing, because text segmentation and recognition methods require properly aligned text lines. Here we present a new method of skew detection based on principal component analysis and multi-resolution processing. Combining these two approaches allows us to increase the correctness of skew detection when finding an arbitrary skew angle from 0 to 179 degrees in low-resolution images of 25-50 dots per inch (dpi). The results obtained compare favorably with those of methods that determine skew at a single resolution under the same conditions.

Keywords: document skew detection, principal component analysis, image processing

1. INTRODUCTION

Skew detection is one of the operations necessary to convert data from paper to a digital representation. After a document is scanned or copied, a non-zero skew often appears due to imprecise document alignment. Editors or document designers may also introduce skewed text lines intentionally in order to emphasize important details. In this case, the skew angle can be arbitrary. The purpose of skew detection is therefore to find the angle by which the image must be rotated to become aligned. This is necessary for subsequent processing (text segmentation and recognition) because these operations are very sensitive to skew. Data retrieval systems also work well only with properly aligned images.

Here we consider only documents containing Roman scripts and assume that all image blocks share the same skew. Hence, in this case, one should detect the angle that yields a horizontal orientation of the text lines when the image is rotated by it. Local skew detection for documents containing differently oriented blocks [1,2] is outside the scope of this paper.

Nowadays there are many skew detection methods [1-16, 18-31]. They can be roughly divided into two classes according to the detectable angle range. The techniques belonging to the first group [1-16, 18-26] can detect a skew only within a limited angle range that usually varies from [-5°,+5°] to [-45°,+45°].
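
To make the role of principal component analysis concrete, the following is a minimal sketch of how PCA can yield a dominant orientation from a set of foreground pixel coordinates. It is not the method of this paper, whose details (including the multi-resolution stage) are not given in this excerpt. The function names `principal_axis_angle` and `synthetic_text_line`, the synthetic bar used as a stand-in for a text line, and the choice of a y-axis pointing up are all assumptions made for illustration.

```python
import math


def principal_axis_angle(points):
    """Return the orientation, in degrees within [0, 180), of the first
    principal component of a list of (x, y) points.

    Assumption: y grows upward. For image coordinates with y growing
    downward, the sign of the angle is mirrored.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix of the coordinates.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form direction of the eigenvector with the largest eigenvalue.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # A principal axis has no direction, so angles are taken modulo 180.
    return math.degrees(angle) % 180.0


def synthetic_text_line(angle_deg, length=200, thickness=6):
    """Hypothetical test data: the pixels of a thin bar rotated by
    angle_deg, standing in for a single skewed text line."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    pts = []
    for u in range(-length // 2, length // 2 + 1):
        for v in range(-thickness // 2, thickness // 2 + 1):
            # Rotate the axis-aligned grid point (u, v) by angle_deg.
            pts.append((u * c - v * s, u * s + v * c))
    return pts


if __name__ == "__main__":
    for true_angle in (30, 120):
        estimate = principal_axis_angle(synthetic_text_line(true_angle))
        print(true_angle, round(estimate, 3))
```

Because a principal axis is undirected, an estimate of this kind is only defined modulo 180 degrees, which is consistent with an angle range of 0 to 179 degrees. On a full page, the spread of foreground pixels reflects the page layout as well as the text lines, so a plain coordinate PCA like this one is only a starting point.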

