Text detection in images based on unsupervised classification of high-frequency wavelet coefficients

Bernd Freisleben ,Julinda Gllavata ,Ralph Ewerth

doi:10.1109/icpr.2004.896

Abstract

Text localization and recognition in images is important for searching information in digital photo archives, video databases and Web sites. However, since text is often printed against a complex background, it is often difficult to detect. In this paper, a robust text localization approach is presented, which can automatically detect horizontally aligned text with different sizes, fonts, colors and languages. First, a wavelet transform is applied to the image and the distribution of high-frequency wavelet coefficients is considered to statistically characterize text and non-text areas. Then, the k-means algorithm is used to classify text areas in the image. The detected text areas undergo a projection analysis in order to refine their localization. Finally, a binary segmented text image is generated, to be used as input to an OCR engine. The detection performance of our approach is demonstrated by presenting experimental results for a set of video frames taken from the MPEG-7 video test set.

Full Text