Abstract
In this paper, we propose an improved dual-microphone voice activity detection (VAD) technique in which discriminative weight training is applied to obtain optimally weighted spatial features. In our approach, we first derive maximum a posteriori (MAP) probabilities from spatial features such as the power level difference ratio (PLDR), phase vector, and coherence. We then combine the MAP probabilities within the minimum classification error (MCE) framework to yield an optimal VAD decision in the spectral domain. Experimental results show that the proposed dual-microphone VAD algorithm outperforms conventional dual-microphone VAD methods, which rely solely on the PLDR, phase, and spectral coherence.
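As a rough illustration of the idea summarized above, the following sketch combines three per-feature speech-presence probabilities with trainable weights and applies one MCE-style update. The abstract does not give the combination rule or the loss, so the linear weighted sum, the sigmoid-smoothed misclassification loss, the 0.5 threshold, and all function and parameter names here are assumptions made for illustration, not the authors' formulation.

```python
import math


def combine_map_probabilities(probs, weights):
    """Weighted average of per-feature speech-presence probabilities.

    probs:   hypothetical MAP probabilities, e.g. [PLDR, phase, coherence].
    weights: non-negative weights, one per feature (assumed linear fusion).
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probs)) / total


def vad_decision(probs, weights, threshold=0.5):
    """Declare speech when the combined score exceeds the threshold."""
    return combine_map_probabilities(probs, weights) > threshold


def mce_weight_step(probs, weights, is_speech, threshold=0.5, lr=0.1, slope=5.0):
    """One gradient step on a sigmoid-smoothed misclassification loss.

    The misclassification measure d is positive when the frame is
    classified incorrectly; the loss is sigmoid(slope * d).
    """
    score = sum(w * p for w, p in zip(weights, probs)) - threshold
    sign = -1.0 if is_speech else 1.0
    d = sign * score
    loss = 1.0 / (1.0 + math.exp(-slope * d))
    grad_common = slope * loss * (1.0 - loss)  # d(loss)/d(d)
    # d(d)/d(w_i) = sign * p_i, so descend along that direction.
    updated = [w - lr * grad_common * sign * p for w, p in zip(weights, probs)]
    # Keep weights non-negative and renormalize so they sum to one.
    updated = [max(w, 0.0) for w in updated]
    total = sum(updated)
    if total == 0.0:
        return [1.0 / len(updated)] * len(updated)
    return [w / total for w in updated]


# Example: a speech frame where only the PLDR-based probability is confident.
frame_probs = [0.9, 0.2, 0.3]
equal_weights = [1.0 / 3.0] * 3
new_weights = mce_weight_step(frame_probs, equal_weights, is_speech=True)
```

With equal weights this example frame is missed, and the update shifts weight toward the feature that was most confident about speech. In practice such a step would be repeated over a labeled training set.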