Abstract

This study describes experiments on the automatic detection of semantic concepts, i.e., textual descriptions of digital video content. The detected concepts can be further used for content-based categorization and access of digital video repositories. Temporal gradient correlogram, temporal color correlogram, and motion activity low-level features are extracted from the dynamic visual content of a video shot. Semantic concepts are detected with an expeditious method based on small sets of positive examples and low-level feature similarities computed between video shots. Detectors using several feature and fusion operator configurations are tested on a 60-hour news video database from the TRECVID 2003 benchmark. The results show that feature fusion based on ranked lists gives better detection performance than fusion of normalized low-level feature space distances. The best performance was obtained by pre-validating the configurations of features and rank fusion operators. The results also show that minimum rank fusion of temporal color and structure features provides comparable performance.
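The rank-based fusion mentioned in the abstract can be illustrated with a minimal sketch of minimum rank fusion: each feature (e.g., temporal color, temporal structure) produces a ranked list of shots by similarity to the positive examples, and each shot is assigned its best (minimum) rank across the lists. The function name, shot IDs, and tie-breaking rule below are illustrative assumptions, not details from the paper.

```python
def min_rank_fusion(ranked_lists):
    """Fuse ranked lists by assigning each shot its minimum rank
    across all lists, then sorting shots by that fused rank.
    Ties are broken by shot ID for determinism (an assumption)."""
    fused = {}
    for ranking in ranked_lists:
        for rank, shot in enumerate(ranking):
            if shot not in fused or rank < fused[shot]:
                fused[shot] = rank
    return sorted(fused, key=lambda s: (fused[s], s))

# Example: hypothetical color-based and structure-based rankings
# over five shots, most similar first.
color_rank = ["s3", "s1", "s4", "s2", "s5"]
structure_rank = ["s2", "s3", "s5", "s1", "s4"]
print(min_rank_fusion([color_rank, structure_rank]))
# → ['s2', 's3', 's1', 's4', 's5']
```

A shot ranked highly by either feature rises to the top of the fused list, which is why this operator behaves differently from averaging normalized feature-space distances.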

