Text Localization and Text Segmentation in Images, Videos and Web Pages

Rainer Lienhart, Axel Wernicke

doi:10.1007/978-3-642-59802-9_21

Abstract

Multimedia data is the fastest growing media type in many areas, especially on the Internet. The text in videos is a powerful high-level index for retrieval. Efficient indexing and retrieval of digital video is an important aspect of multimedia databases. Detecting, extracting and recognizing text can build such an index. Segmenting and recognizing text in the non-text parts of web pages is also a very important issue. More and more web pages present text in images. Existing text segmentation and text recognition algorithms cannot extract this text. Thus, no existing search engine can index the content of image-rich web pages properly! A new, robust, and true multi-resolution approach to localizing and segmenting text in videos and images is proposed in this paper. It has been tested extensively on a large variety of video sizes, from 352×240 up to 1920×1280, and on a large representative set of video sequences such as home videos, newscasts, title sequences and commercials, as well as on images.

Full Text