Abstract

In this paper, we discuss an audio-visual approach to automatic web video categorization. To this end, we propose content descriptors which exploit audio, temporal, and color content. The power of our descriptors was validated both in the context of a classification system and as part of an information retrieval approach. For this purpose, we used a real-world scenario, comprising 26 video categories from the blip.tv media platform (up to 421 h of video footage). Additionally, to bridge the descriptor semantic gap, we propose a new relevance feedback technique which is based on hierarchical clustering. Experiments demonstrated that with this technique retrieval performance can be increased significantly and becomes comparable to that of high-level semantic textual descriptors.
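
The abstract names, but does not detail, the hierarchical-clustering-based relevance feedback technique. As a purely illustrative sketch, and not the method of the paper, the fragment below shows one common way hierarchical clustering can be combined with relevance feedback: descriptors of user-marked relevant results are clustered agglomeratively, and the cluster centroids act as refined query points for re-ranking. All function names, parameters, and data shapes here are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import cdist

def refine_ranking(features, relevant_idx, n_clusters=3):
    """Re-rank all items by distance to clusters built from
    user-marked relevant examples (illustrative sketch only)."""
    relevant = features[relevant_idx]
    # Agglomerative (hierarchical) clustering of the relevant examples
    Z = linkage(relevant, method="ward")
    labels = fcluster(Z, t=min(n_clusters, len(relevant_idx)),
                      criterion="maxclust")
    # Cluster centroids serve as refined query points
    centroids = np.array([relevant[labels == c].mean(axis=0)
                          for c in np.unique(labels)])
    # Score every item by its distance to the nearest centroid
    scores = cdist(features, centroids).min(axis=1)
    # Ascending distance corresponds to descending relevance
    return np.argsort(scores)

# Hypothetical example: 100 videos with 64-dimensional audio-visual descriptors
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 64))
ranking = refine_ranking(feats, relevant_idx=[3, 17, 42, 58])
print(ranking[:10])
```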
