Abstract
In this paper, we discuss an audio-visual approach to automatic web video categorization. To this end, we propose content descriptors that exploit audio, temporal, and color information. The discriminative power of our descriptors was validated both in the context of a classification system and as part of an information retrieval approach. For this purpose, we used a real-world scenario comprising 26 video categories from the blip.tv media platform (up to 421 h of video footage). Additionally, to bridge the semantic gap of the descriptors, we propose a new relevance feedback technique based on hierarchical clustering. Experiments demonstrate that this technique significantly increases retrieval performance, making it comparable to that of high-level semantic textual descriptors.