Abstract

Most voice activity detection (VAD) methods are based on statistical models. These methods typically assume that the noise signal follows a Gaussian distribution. This assumption does not always hold in practice, and as a result such methods fail to distinguish speech from noise at low signal-to-noise ratio (SNR) levels under non-stationary noise conditions. To further improve the robustness of VAD, a method based on enhanced speech is proposed. In the proposed method, a Laplacian distribution is used to model the residual noise, since we find that the residual noise in enhanced speech follows a Laplacian distribution; in addition, a Gaussian mixture model is used to characterize the discrete Fourier transform (DFT) coefficients of the reconstructed speech in the enhanced speech. Experimental results show that the proposed method performs better than the baseline method, especially in low-SNR and non-stationary noise conditions.
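
To make the modeling idea concrete, the following is a minimal sketch of a frame-level log-likelihood ratio that scores a Laplacian noise model against a Gaussian mixture speech model. It is not the method of the paper. It assumes real-valued, zero-mean coefficients (for example, the real parts of DFT bins), it scores the speech hypothesis with the mixture alone rather than with a combined speech-plus-noise density, and all function names and parameter values (`laplacian_logpdf`, `gmm_logpdf`, `frame_llr`, `noise_scale`, `weights`, `sigmas`) are hypothetical choices made for illustration.

```python
import numpy as np

def laplacian_logpdf(x, scale):
    """Log-density of a zero-mean Laplacian: -log(2b) - |x| / b."""
    x = np.asarray(x, dtype=float)
    return -np.log(2.0 * scale) - np.abs(x) / scale

def gmm_logpdf(x, weights, sigmas):
    """Log-density of a zero-mean Gaussian mixture, evaluated per sample."""
    x = np.asarray(x, dtype=float)[..., None]
    weights = np.asarray(weights, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Per-component log of (weight * Gaussian density).
    log_comp = (np.log(weights)
                - 0.5 * np.log(2.0 * np.pi * sigmas ** 2)
                - 0.5 * (x / sigmas) ** 2)
    # Log-sum-exp over components for numerical stability.
    peak = log_comp.max(axis=-1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=-1, keepdims=True)))[..., 0]

def frame_llr(coeffs, noise_scale, weights, sigmas):
    """Mean log-likelihood ratio of one frame (assumed simplification):
    mixture speech model versus Laplacian noise model."""
    llr = gmm_logpdf(coeffs, weights, sigmas) - laplacian_logpdf(coeffs, noise_scale)
    return float(np.mean(llr))

# Hypothetical parameters; in practice they would be estimated from data.
noise_scale = 0.1
weights = np.array([0.6, 0.4])
sigmas = np.array([0.5, 2.0])

rng = np.random.default_rng(0)
# Synthetic noise-only frame drawn from the Laplacian model.
noise_frame = rng.laplace(0.0, noise_scale, size=256)
# Synthetic speech-like frame drawn from the mixture model.
component = rng.choice(len(weights), size=256, p=weights)
speech_frame = rng.normal(0.0, sigmas[component])

noise_score = frame_llr(noise_frame, noise_scale, weights, sigmas)
speech_score = frame_llr(speech_frame, noise_scale, weights, sigmas)
is_speech = speech_score > 0.0  # illustrative decision threshold of zero
```

Under these assumptions, a frame would be labeled as speech when its score exceeds a threshold. The threshold of zero used here is only illustrative and would normally be tuned on held-out data.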
