Audio-visual grouplet

Wei Jiang, Alexander C. Loui

doi:10.1145/2072298.2072316

Abstract

We investigate general concept classification in unconstrained videos by joint audio-visual analysis. A novel representation, the Audio-Visual Grouplet (AVG), is extracted by studying the statistical temporal audio-visual interactions. An AVG is defined as a set of audio and visual codewords that are grouped together according to their strong temporal correlations in videos. The AVGs carry unique audio-visual cues to represent the video content, based on which an audio-visual dictionary can be constructed for concept classification. By using the entire AVGs as building elements, the audio-visual dictionary is much more robust than traditional vocabularies that use discrete audio or visual codewords. Specifically, we conduct coarse-level foreground/background separation in both audio and visual channels, and discover four types of AVGs by exploring mixed-and-matched temporal audio-visual correlations among the following factors: visual foreground, visual background, audio foreground, and audio background. All of these types of AVGs provide discriminative audio-visual patterns for classifying various semantic concepts. We extensively evaluate our method over the large-scale Columbia Consumer Video set. Experiments demonstrate that the AVG-based dictionaries can achieve consistent and significant performance improvements compared with other state-of-the-art approaches.
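The grouping idea in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes each video has already been reduced to per-time-window occurrence counts of audio and visual codewords, uses plain Pearson correlation as a stand-in for the "strong temporal correlations" the abstract mentions, and omits the foreground/background separation. The function name `group_codewords` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def _standardize(counts):
    """Center each codeword's time series and scale it to unit variance."""
    centered = counts - counts.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # constant series get zero correlation instead of NaN
    return centered / std


def group_codewords(audio_counts, visual_counts, threshold=0.8):
    """Group each audio codeword with the visual codewords it co-varies with.

    audio_counts:  array of shape (T, A), counts of A audio codewords per window
    visual_counts: array of shape (T, V), counts of V visual codewords per window
    threshold:     assumed cutoff on Pearson correlation (illustrative only)

    Returns the (A, V) correlation matrix and a list of
    (audio_index, [visual_indices]) pairs, one per non-empty group.
    """
    audio = _standardize(np.asarray(audio_counts, dtype=float))
    visual = _standardize(np.asarray(visual_counts, dtype=float))
    n_windows = audio.shape[0]

    # Pearson correlation between every audio and every visual codeword.
    corr = audio.T @ visual / n_windows

    groups = []
    for a_idx in range(corr.shape[0]):
        partners = np.flatnonzero(corr[a_idx] >= threshold)
        if partners.size:
            groups.append((a_idx, partners.tolist()))
    return corr, groups
```

Each returned pair is a toy analogue of one grouplet: an audio codeword together with the visual codewords whose occurrence over time tracks it closely. A dictionary built from such groups would then treat each group, rather than each individual codeword, as one building element.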

Similar Papers

Video concept detection by audio-visual grouplets
Wei Jiang ... Alexander C. Loui
International Journal of Multimedia Information Retrieval | VOL. 1
07 Sep 2012

Grouplet-Based Distance Metric Learning for Video Concept Detection
Wei Jiang ... Alexander C. Loui
01 Jul 2012

Dimensional perception of a ‘smiling McGurk effect’
Ilaria Torre ... Rachel Mcdonnell
28 Sep 2021

Mobile video concept classification
Wei Jiang
International Journal of Multimedia Information Retrieval | VOL. 2
11 Dec 2012
