A voice activity detection based on the adaptive integration of multiple speech features and a signal decision scheme

Masakiyo Fujimoto, Tomohiro Nakatani, Kentaro Ishizuka

doi:10.1109/icassp.2008.4518641

Publication Date: Mar 1, 2008

Citations: 37

Affiliation: NTT (Japan)

Abstract

This paper addresses the problem of voice activity detection (VAD) in noisy environments. The VAD method proposed in this paper integrates multiple speech features and a signal decision scheme, namely the speech periodic-to-aperiodic component ratio and a switching Kalman filter. The integration is carried out by taking the weighted sum of the likelihoods output by each VAD (stream). The stream weight is decided adaptively at each short-time frame. The evaluation is carried out with a VAD evaluation framework, CENSREC-1-C. The evaluation results revealed that the proposed method significantly outperforms the baseline results of CENSREC-1-C in terms of VAD accuracy in real environments. In addition, we carried out speech recognition evaluations using the detected speech signals, and confirmed that the proposed method contributes to an improvement in speech recognition accuracy.
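
The weighted-sum integration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `fuse_streams` and `detect_speech`, the dictionary layout of the likelihoods, and the rule "label a frame as speech when the fused speech likelihood exceeds the fused non-speech likelihood" are all assumptions made for illustration. The abstract does not say how the per-frame stream weight is adapted, so the sketch takes the weights as given inputs rather than guessing a rule.

```python
def fuse_streams(par_lik, skf_lik, weight):
    """Fuse one frame's likelihoods from two VAD streams by a weighted sum.

    par_lik, skf_lik: dicts with keys "speech" and "nonspeech" holding the
        likelihoods from the two streams (hypothetical data layout).
    weight: weight of the first stream, in [0, 1]; the second stream
        receives 1 - weight (assumed normalisation).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return {
        label: weight * par_lik[label] + (1.0 - weight) * skf_lik[label]
        for label in ("speech", "nonspeech")
    }


def detect_speech(par_liks, skf_liks, weights):
    """Return one boolean speech/non-speech decision per frame.

    The per-frame weights are supplied by the caller; how they are
    adapted is outside the scope of this sketch.
    """
    decisions = []
    for par, skf, w in zip(par_liks, skf_liks, weights):
        fused = fuse_streams(par, skf, w)
        # Assumed decision rule: pick the class with the larger fused likelihood.
        decisions.append(fused["speech"] > fused["nonspeech"])
    return decisions


# Toy likelihood values chosen only to exercise the code.
par_liks = [{"speech": 0.8, "nonspeech": 0.2}, {"speech": 0.3, "nonspeech": 0.7}]
skf_liks = [{"speech": 0.4, "nonspeech": 0.6}, {"speech": 0.6, "nonspeech": 0.4}]
weights = [0.5, 0.9]
decisions = detect_speech(par_liks, skf_liks, weights)
```

Because the weight changes from frame to frame, the stream that is trusted more can differ between frames: in the toy data the second frame leans heavily on the first stream, which overrides the second stream's vote for speech.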

Full Text