Abstract

This paper introduces a new speech enhancement algorithm based on adaptive thresholding of the intrinsic mode functions (IMFs) of noisy signal frames, extracted by empirical mode decomposition (EMD). Adaptive threshold values are estimated using gamma statistical models of the Teager-energy-operated IMFs of the noisy speech and of the estimated noise, based on the symmetric Kullback–Leibler divergence. The enhanced speech signal is obtained by applying a semisoft thresholding function to the IMF coefficients of the noisy speech. The method is tested on the NOIZEUS speech database and compared with wavelet-shrinkage and EMD-shrinkage methods in terms of segmental SNR improvement (SegSNR), weighted spectral slope (WSS), and perceptual evaluation of speech quality (PESQ). Experimental results show that the proposed method provides a higher SegSNR improvement in dB, a lower WSS distance, and higher PESQ scores than the wavelet-shrinkage and EMD-shrinkage methods. It also outperforms traditional threshold-based speech enhancement approaches across SNR levels, from high to low.
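Two of the building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and a symmetric Kullback–Leibler divergence computed between two discrete distributions (for example, normalized histograms). The paper fits gamma models before comparing them; the histogram form below is a simplification, not the authors' procedure, and the function names and the `eps` guard are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy for the interior samples of x.

    psi[n] = x[n]^2 - x[n-1] * x[n+1]; the two end samples have no
    neighbours on both sides, so the output is two samples shorter.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence KL(p||q) + KL(q||p) of two discrete distributions.

    p and q are non-negative weights of equal length; each is normalized
    to sum to one. eps (an assumption of this sketch) avoids log(0).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i = p_i / sum_p + eps
        q_i = q_i / sum_q + eps
        # KL(p||q) + KL(q||p) collapses to sum((p - q) * log(p / q)).
        total += (p_i - q_i) * math.log(p_i / q_i)
    return total
```

For a pure sinusoid cos(w*n), the operator returns the constant sin(w)^2, which is why it is often used as an amplitude- and frequency-sensitive energy measure. The divergence is zero for identical distributions and grows as they separate.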

Highlights

  • The corruption of speech signals by environmental noise degrades the quality and intelligibility of speech, severely reducing the performance of applications that depend on it

  • The noisy speech signals and estimated noise were decomposed into 7 intrinsic mode functions (IMFs) by the empirical mode decomposition (EMD) method

  • To obtain the enhanced speech, the proposed method used the semisoft thresholding function, whereas the wavelet-shrinkage and EMD-shrinkage methods used the soft thresholding function
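The difference between the two thresholding rules mentioned in the highlights can be sketched as follows. This assumes the textbook definitions: soft thresholding with a single threshold, and semisoft (firm) thresholding with a lower and an upper threshold. The function and parameter names are hypothetical, and the exact form and threshold values used in the paper may differ.

```python
def soft_threshold(x, lam):
    """Soft thresholding: zero values with |x| <= lam, shrink the rest by lam."""
    mag = abs(x) - lam
    if mag <= 0.0:
        return 0.0
    return mag if x > 0 else -mag


def semisoft_threshold(x, lam_low, lam_high):
    """Semisoft (firm) thresholding with two thresholds.

    |x| <= lam_low            -> 0 (treated as noise)
    lam_low < |x| <= lam_high -> linearly rescaled, reaching |x| at lam_high
    |x| > lam_high            -> kept unchanged (treated as signal)
    """
    if not 0.0 <= lam_low < lam_high:
        raise ValueError("require 0 <= lam_low < lam_high")
    mag = abs(x)
    if mag <= lam_low:
        return 0.0
    if mag > lam_high:
        return float(x)
    scaled = lam_high * (mag - lam_low) / (lam_high - lam_low)
    return scaled if x > 0 else -scaled


# Apply both rules to a few example coefficients (illustrative values only).
coeffs = [-3.0, -1.5, 0.5, 1.5, 3.0]
soft = [soft_threshold(c, 1.0) for c in coeffs]
semisoft = [semisoft_threshold(c, 1.0, 2.0) for c in coeffs]
```

Unlike soft thresholding, which shrinks every surviving coefficient by the threshold, the semisoft rule leaves large coefficients untouched, so strong signal components are not attenuated.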


Summary

Introduction

The corruption of speech signals by environmental noise degrades the quality and intelligibility of speech, severely reducing the performance of applications that depend on it. Various speech enhancement methods have been developed to reduce noise while preserving quality and intelligibility. These methods fall into three basic groups according to the domain in which they operate: time, frequency, and time-frequency. The spectral subtraction approach is historically one of the first noise reduction algorithms and is based on a simple principle. In this method, however, the processed speech signals are degraded by an unwanted, disturbing sound known as “musical noise”.

