Abstract
A dual-microphone voice activity detection (VAD) technique is proposed that applies discriminative weight training to optimally weight the spatial features available from the two microphones. Because the method is motivated by the use of the relevant spatial information available from the two microphones, we employ the phase difference, coherence, and power level difference ratio (PLDR) as a feature vector and use this feature vector to derive maximum a posteriori (MAP) probabilities. We then combine the MAP probabilities using discriminative weight training, i.e., the minimum classification error (MCE) method, to obtain an optimal VAD decision in the spectral domain, which successfully represents the dynamic evolution of speech over time even in non-stationary noise environments. The proposed dual-microphone VAD algorithm outperforms conventional dual-microphone VAD methods that rely on only a single feature among the PLDR, phase difference, and spectral coherence.
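The weighted combination of per-feature probabilities can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the function names, the linear combination rule, the sigmoid loss with slope `gamma`, the decision threshold of 0.5, and the synthetic data are all assumptions chosen for illustration. Each column of `probs` stands in for the speech probability derived from one spatial feature (for example phase difference, coherence, and PLDR), and the weights are trained by gradient descent on a smoothed classification error in the spirit of MCE.

```python
import numpy as np

def combine(probs, weights):
    """Weighted sum of per-feature speech probabilities, shape (frames, features)."""
    return probs @ weights

def train_mce_weights(probs, labels, threshold=0.5, gamma=5.0, lr=0.1, epochs=200):
    """Hypothetical MCE-style training: minimise a sigmoid-smoothed error count."""
    n_features = probs.shape[1]
    w = np.full(n_features, 1.0 / n_features)   # start from equal weights
    y = np.where(labels == 1, 1.0, -1.0)        # +1 speech, -1 noise
    for _ in range(epochs):
        score = combine(probs, w)
        d = -y * (score - threshold)            # positive when a frame is misclassified
        loss = 1.0 / (1.0 + np.exp(-gamma * d)) # smooth approximation of the 0/1 error
        # Chain rule: dloss/dw = gamma * loss * (1 - loss) * dd/dw, with dd/dw = -y * probs
        grad = ((gamma * loss * (1.0 - loss) * -y)[:, None] * probs).mean(axis=0)
        w = w - lr * grad
        w = np.clip(w, 1e-6, None)              # assumed constraint: positive weights
        w = w / w.sum()                         # assumed constraint: weights sum to one
    return w

# Synthetic stand-in data (assumption): one informative feature, two uninformative ones.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
informative = np.clip(0.1 + 0.8 * labels + 0.05 * rng.standard_normal(400), 0.0, 1.0)
uninformative = rng.uniform(0.0, 1.0, size=(400, 2))
probs = np.column_stack([informative, uninformative])

weights = train_mce_weights(probs, labels)
decisions = combine(probs, weights) > 0.5       # True marks a frame as speech
print("trained weights:", weights)
```

In this toy setting the training shifts weight toward the feature that separates speech from noise, which is the behaviour that motivates combining several spatial features rather than relying on a single one.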