Abstract

Camera-based text processing has attracted considerable attention and numerous methods have been proposed. However, most of these methods have focused on the scene text detection problem and relatively little work has been performed on camera-captured document images. In this paper, we present a text-line detection algorithm for camera-captured document images, which is an essential step toward document understanding. In particular, our method is developed by incorporating state estimation (an extension of scale selection) into a connected component (CC)-based framework. To be precise, we extract CCs with the maximally stable extremal region algorithm and estimate the scales and orientations of CCs from their projection profiles. Since this state estimation facilitates a merging process (bottom-up clustering) and provides a stopping criterion, our method is able to handle arbitrarily oriented text-lines and works robustly for a range of scales. Finally, a text-line/non-text-line classifier is trained and non-text candidates (e.g., background clutters) are filtered out with the classifier. Experimental results show that the proposed method outperforms conventional methods on a standard dataset and works well for a new challenging dataset.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.