The large volume and ubiquitous availability of multimedia information (e.g., video, audio, images, and text documents) require efficient, effective, and automatic methods for annotation and retrieval. As video plays an increasingly important role in multimedia, content-based video retrieval becomes a pressing issue, especially because there should be an integrated methodology for all types of multimedia documents. Our approach to the integrated retrieval of videos, images, and text comprises three steps. First, shots are detected and extracted from a video. Second, a still image is constructed from the frames of each shot, either by extracting key frames or by a mosaicing technique. The result is a single-image visualization of the shot. Third, this image is analyzed by the ImageMiner™ system (ImageMiner is a trademark of IBM Corp.). The ImageMiner system was developed in cooperation with IBM at the University of Bremen, in the Image Processing Department of the Center for Computing Technologies. It realizes content-based retrieval of single images through a novel combination of techniques and methods from computer vision and artificial intelligence. Its output is a textual description of an image and thus, in our case, of the static elements of a video shot. In this way, the annotations of a video can be indexed with standard text retrieval systems, along with text documents or annotations of other multimedia documents, ensuring an integrated interface for all kinds of multimedia documents.
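To make the three-step pipeline concrete, the following is a minimal sketch in Python, not the method used in this work. It rests on several assumptions that do not come from the text: shots are detected by thresholding the difference between grayscale histograms of consecutive frames, the still image of a shot is simply its middle frame (no mosaicing), and the image analysis step is represented by a hypothetical `describe` callback standing in for a system such as ImageMiner. All function names, the frame representation, and the threshold value are illustrative only.

```python
from typing import Callable, List, Sequence

# Assumption: a frame is a non-empty flat sequence of grayscale values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return the normalized grayscale histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_shots(frames: Sequence[Frame], threshold: float = 0.5) -> List[range]:
    """Step 1 (assumed technique): cut wherever consecutive histograms differ strongly."""
    if not frames:
        return []
    boundaries = [0]
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        # The L1 distance between two normalized histograms lies in [0, 2].
        distance = sum(abs(a - b) for a, b in zip(prev, cur))
        if distance > threshold:
            boundaries.append(i)
        prev = cur
    boundaries.append(len(frames))
    return [range(start, end) for start, end in zip(boundaries, boundaries[1:])]


def key_frame_index(shot: range) -> int:
    """Step 2 (assumed technique): use the middle frame as the shot's still image."""
    return shot.start + len(shot) // 2


def annotate_video(frames: Sequence[Frame],
                   describe: Callable[[Frame], str]) -> List[str]:
    """Step 3: produce one textual annotation per shot.

    `describe` is a hypothetical placeholder for an image analysis system.
    """
    return [describe(frames[key_frame_index(shot)]) for shot in detect_shots(frames)]


if __name__ == "__main__":
    dark, bright = [10] * 16, [240] * 16
    video = [dark, dark, dark, bright, bright]
    label = lambda f: "dark" if sum(f) / len(f) < 128 else "bright"
    print(annotate_video(video, label))
```

The resulting list of strings, one per shot, is the kind of output that could be passed to a standard text retrieval system alongside annotations of other documents.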