Kernel-Based Sensor Fusion With Application to Audio-Visual Voice Activity Detection

David Dov, Ronen Talmon, Israel Cohen

doi:10.1109/tsp.2016.2605068

Abstract

In this paper, we address the problem of multiple view data fusion in the presence of noise and interferences. Recent studies have approached this problem using kernel methods, by relying particularly on a product of kernels constructed separately for each view. From a graph theory point of view, we analyze this fusion approach in a discrete setting. More specifically, based on a statistical model for the connectivity between data points, we propose an algorithm for the selection of the kernel bandwidth, a parameter that, as we show, has important implications on the robustness of this fusion approach to interferences. Then, we consider the fusion of audio-visual speech signals measured by a single microphone and by a video camera pointed to the face of the speaker. Specifically, we address the task of voice activity detection, i.e., the detection of speech and nonspeech segments, in the presence of structured interferences such as keyboard taps and office noise. We propose an algorithm for voice activity detection based on the audio-visual signal. Simulation results show that the proposed algorithm outperforms competing fusion and voice activity detection approaches. In addition, we demonstrate that a proper selection of the kernel bandwidth indeed leads to improved performance.
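As a rough illustration of the fusion idea the abstract describes, the following minimal sketch builds a Gaussian kernel separately for each of two views and combines them through a product. Several things here are assumptions and not taken from the paper: the toy one-dimensional "audio" and "video" features, the helper names (`gaussian_kernel`, `row_normalize`, `fuse_views`), the reading of "product of kernels" as a matrix product of row-normalized kernels (an element-wise product is another possible reading), and the median heuristic used as a stand-in for the bandwidth. The paper proposes its own bandwidth selection algorithm, which is not reproduced here.

```python
import math
from statistics import median


def pairwise_sq_dists(points):
    """Squared Euclidean distances between all pairs of feature vectors."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
            for p in points]


def gaussian_kernel(points, bandwidth=None):
    """Gaussian affinity matrix exp(-d^2 / bandwidth) for a single view.

    If no bandwidth is given, use the median of the off-diagonal squared
    distances. This is a common heuristic and an assumption of this sketch,
    NOT the selection algorithm proposed in the paper.
    """
    d2 = pairwise_sq_dists(points)
    n = len(points)
    if bandwidth is None:
        bandwidth = median(d2[i][j] for i in range(n) for j in range(n) if i != j)
    return [[math.exp(-d / bandwidth) for d in row] for row in d2]


def row_normalize(kernel):
    """Scale each row to sum to one (a row-stochastic matrix)."""
    return [[v / sum(row) for v in row] for row in kernel]


def fuse_views(kernel_a, kernel_b):
    """Fuse two views as the matrix product of their row-normalized kernels.

    Assumption: "product of kernels" is interpreted as a matrix product.
    """
    pa, pb = row_normalize(kernel_a), row_normalize(kernel_b)
    n = len(pa)
    return [[sum(pa[i][k] * pb[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: four frames, one feature per view.
# Frames 0-1 and frames 2-3 form two groups in both views.
audio = [[0.0], [0.1], [1.0], [1.1]]
video = [[0.0], [0.2], [2.0], [2.1]]

fused = fuse_views(gaussian_kernel(audio), gaussian_kernel(video))
```

In this toy setup, frames that are close in both views keep a stronger fused affinity than frames from different groups. Because the bandwidth controls how quickly affinities decay with distance, changing it changes which connections survive the product, which is the sensitivity the abstract attributes to this parameter.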


Journal: IEEE Transactions on Signal Processing
Publication Date: Dec 15, 2016
Citations: 59

Similar Papers

A Computationally Efficient Mel-Filter Bank VAD Algorithm for Distributed Speech Recognition Systems
Damjan Vlaj ... Zdravko Kačič
EURASIP Journal on Advances in Signal Processing | VOL. 2005
30 Mar 2005

Long-term auto-correlation statistics based voice activity detection for strong noisy speech
Wei Shi ... Yi Liu
01 Jul 2014

Robust voice activity detection algorithm based on the perceptual wavelet packet transform
Shi-Huang Chen ... Chia-Hsiang Chen
01 Jan 2004

A maximum log-likelihood approach to voice activity detection
Oliver Gauci ... Carl J Debono
01 Mar 2008
