Abstract
This article addresses the problem of instantaneous signal-to-noise ratio (SNR) estimation during speech activity, with the aim of improving the performance of speech enhancement algorithms. It is shown that the kurtosis of noisy speech may be used to estimate the speech and noise energies separately when speech is divided into narrow bands. Based on this concept, a novel method is proposed to continuously estimate the SNR across the frequency bands without the need for a speech detector. The derivations are based on a sinusoidal model for speech and a Gaussian assumption about the noise. Experimental results using recorded speech and noise show that the model and the derivations are valid, though not entirely accurate across the whole spectrum. It is also found that many noise types encountered in mobile telephony are close to Gaussian as far as higher-order statistics are concerned, which makes this scheme quite effective.
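To illustrate why kurtosis can separate the two energies, the following is a minimal sketch and not the article's actual estimator. It assumes that a narrow band contains a single real sinusoid of power P_s plus independent zero-mean Gaussian noise of power P_n. Under those assumptions the fourth-order cumulant of the mixture, m4 - 3*m2^2, equals -1.5*P_s^2, because the Gaussian part contributes nothing to it; the second moment m2 then gives P_n = m2 - P_s. The function name `estimate_band_snr` and all signal parameters are hypothetical choices made for this example.

```python
import numpy as np


def estimate_band_snr(x):
    """Estimate (speech power, noise power) in one narrow band.

    Assumption (illustrative only): x = one real sinusoid + independent
    zero-mean Gaussian noise.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove any DC offset
    m2 = np.mean(x ** 2)                  # total power: P_s + P_n
    m4 = np.mean(x ** 4)                  # fourth moment
    c4 = m4 - 3.0 * m2 ** 2               # fourth cumulant: -1.5 * P_s**2
    p_s = np.sqrt(max(-c4, 0.0) / 1.5)    # clip in case c4 comes out positive
    p_s = min(p_s, m2)                    # speech power cannot exceed total power
    p_n = m2 - p_s
    return p_s, p_n


# Synthetic check with known powers: P_s = amp**2 / 2 = 2.0, P_n = 1.0.
rng = np.random.default_rng(0)
n = 200_000
t = np.arange(n)
amp = 2.0
speech = amp * np.cos(0.1 * t + 0.3)      # stand-in for one speech harmonic
noise = rng.normal(0.0, 1.0, n)           # Gaussian noise, unit variance

ps_hat, pn_hat = estimate_band_snr(speech + noise)
snr_db = 10.0 * np.log10(ps_hat / pn_hat)
print(f"P_s ~ {ps_hat:.3f}, P_n ~ {pn_hat:.3f}, SNR ~ {snr_db:.2f} dB")
```

No speech detector appears anywhere in the sketch: both powers come from moments of the noisy signal alone. With several sinusoids in a band, or with non-Gaussian noise, the constant 1.5 no longer applies and the estimate is biased, which is consistent with the abstract's remark that the model is not entirely accurate across the whole spectrum.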