Abstract

This chapter presents techniques for characterizing and fusing the audio and visual content of videos, and demonstrates their application to movie database retrieval. In the audio domain, a study of the distribution of wavelet coefficients of audio signals shows that it is too peaky to be modeled effectively by a single distribution. A modeling method based on a Laplacian mixture model (LMM) is therefore used to analyze audio content and extract audio features. The indexed features are low-dimensional, which is important for the retrieval efficiency of the system in terms of response time. Alongside the audio feature, a visual feature is extracted by template frequency modeling; both are referred to as perceptual features. A learning algorithm for audiovisual fusion is then presented: the two features are fused at the late fusion stage and input to a support vector machine (SVM), which learns semantic concepts from a given video database. Experimental results show that the system implementing the SVM-based fusion technique achieves high classification accuracy on a large database of Hollywood movies.
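To make the audio-side modeling concrete, the sketch below fits a two-component zero-mean Laplacian mixture to the wavelet coefficients of each detail subband via expectation-maximization, and uses the estimated mixture weights and scale parameters as a compact feature vector. This is a minimal sketch of the general technique the abstract names; the specific wavelet family (`db4`), decomposition level, number of components, and initialization are illustrative assumptions, not choices stated in the chapter.

```python
import numpy as np
import pywt

def fit_laplacian_mixture(coeffs, k=2, iters=100, tol=1e-6):
    # Components are zero-mean Laplacians, 1/(2b) * exp(-|x|/b),
    # so the likelihood depends only on |x|.
    x = np.abs(np.asarray(coeffs, dtype=float))
    b = np.quantile(x, np.linspace(0.3, 0.9, k)) + 1e-8   # initial scales
    w = np.full(k, 1.0 / k)                               # initial weights
    prev_ll = -np.inf
    for _ in range(iters):
        # E-step: per-component log densities and responsibilities.
        logp = np.log(w) - np.log(2.0 * b) - x[:, None] / b
        m = logp.max(axis=1, keepdims=True)
        p = np.exp(logp - m)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: weights are mean responsibilities; each scale is the
        # responsibility-weighted mean absolute coefficient.
        nk = r.sum(axis=0)
        w = nk / len(x)
        b = (r * x[:, None]).sum(axis=0) / nk + 1e-12
        ll = float((np.log(p.sum(axis=1)) + m.ravel()).sum())
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return np.concatenate([w, b])  # 2k parameters per subband

def audio_feature_vector(signal, wavelet="db4", level=3):
    # Model each detail subband with an LMM; the concatenated
    # parameters form a low-dimensional audio descriptor.
    subbands = pywt.wavedec(signal, wavelet, level=level)
    return np.concatenate([fit_laplacian_mixture(c) for c in subbands[1:]])
```

With k = 2 and three detail subbands, the descriptor has only 12 entries, which illustrates why LMM parameters keep the indexed feature dimension, and hence the retrieval response time, low.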
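The abstract does not detail the fusion architecture beyond late-stage SVM fusion, so the following sketch assumes one common late-fusion variant: a per-modality SVM for the audio and visual perceptual features, whose decision scores are combined by a second SVM that learns the semantic concept. Function names, kernels, and the scikit-learn API choices are illustrative assumptions.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_late_fusion(X_audio, X_visual, y):
    # Stage 1: one SVM per modality, trained on its perceptual features.
    svm_a = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_audio, y)
    svm_v = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_visual, y)
    # Stage 2: fuse the per-modality decision scores with a second SVM.
    scores = np.column_stack([svm_a.decision_function(X_audio),
                              svm_v.decision_function(X_visual)])
    fuser = SVC(kernel="rbf").fit(scores, y)
    return svm_a, svm_v, fuser

def classify(svm_a, svm_v, fuser, X_audio, X_visual):
    scores = np.column_stack([svm_a.decision_function(X_audio),
                              svm_v.decision_function(X_visual)])
    return fuser.predict(scores)
```

In practice the fusing SVM should be trained on cross-validated decision scores rather than in-sample scores, since scores from the training data overstate each modality's reliability.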
