One of the most significant challenges in real-time speech processing applications is removing noise from corrupted speech data. This noise can significantly degrade the performance of those applications, so a robust noise reduction algorithm is crucial for improving the accuracy of automatic speech recognition (ASR) and other speech processing systems under uncontrolled conditions. In this paper, we propose an algorithm to reduce background noise in degraded speech data under highly non-stationary conditions. The proposed optimal smoothing and minima controlled (OSMC) technique uses recursive averaging to enhance degraded speech data. First, a smoothed periodogram and the local minima of the degraded speech data are computed, and a time-frequency-dependent threshold factor is determined. The ratio of the smoothed periodogram to the local minima is then used to find the active regions of speech in the degraded speech data by adapting the Bayesian minimum cost decision rule. Time-frequency smoothing factors are used to calculate the estimated noise spectrum for each frequency bin. The perceptual evaluation of speech quality (PESQ) and the normalized covariance metric (NCM) are used to evaluate the speech quality and intelligibility achieved by the proposed technique, compared with competing algorithms, after speech enhancement. The experimental results demonstrate that, relative to unprocessed speech under highly non-stationary noisy environments, the proposed algorithm improves the average PESQ by 15.03% and 16.71% and the average NCM by 3.45% and 5.73% for the NOIZEUS and Kannada speech databases at 5 dB and 10 dB, respectively.
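The noise-tracking steps summarized above (smoothed periodogram, local minima, ratio test, recursive averaging) can be illustrated with a minimal sketch. This is a generic minima-controlled recursive averaging loop, not the authors' OSMC implementation: the function name `estimate_noise`, the parameters `alpha_s`, `alpha_d`, `window`, and `threshold`, and their default values are all assumptions made for illustration. In particular, a fixed scalar threshold stands in for the time-frequency-dependent threshold factor and the Bayesian minimum cost decision rule, which the abstract does not specify.

```python
import numpy as np


def estimate_noise(power_spec, alpha_s=0.8, alpha_d=0.95, window=50, threshold=5.0):
    """Track a noise power spectrum frame by frame (illustrative sketch only).

    power_spec : array of shape (frames, bins) holding |Y(t, k)|^2.
    alpha_s    : smoothing constant for the periodogram (assumed value).
    alpha_d    : smoothing constant for the noise update (assumed value).
    window     : number of past frames searched for the local minimum (assumed).
    threshold  : fixed ratio threshold; a stand-in for the paper's
                 time-frequency-dependent threshold factor.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    n_frames, _ = power_spec.shape

    smoothed = power_spec[0].copy()   # recursively smoothed periodogram
    noise = power_spec[0].copy()      # running noise estimate
    history = [smoothed.copy()]       # recent smoothed frames for the minimum search
    noise_track = np.empty_like(power_spec)
    noise_track[0] = noise

    for t in range(1, n_frames):
        # Step 1: recursive (first-order) smoothing of the periodogram.
        smoothed = alpha_s * smoothed + (1.0 - alpha_s) * power_spec[t]

        # Step 2: local minimum of the smoothed periodogram over a sliding window.
        history.append(smoothed.copy())
        if len(history) > window:
            history.pop(0)
        local_min = np.min(history, axis=0)

        # Step 3: ratio test flags bins where speech is likely active.
        ratio = smoothed / np.maximum(local_min, 1e-12)
        speech_present = ratio > threshold

        # Step 4: time-frequency-dependent smoothing factor. The estimate is
        # frozen (factor 1) where speech is flagged and updated elsewhere.
        alpha = np.where(speech_present, 1.0, alpha_d)
        noise = alpha * noise + (1.0 - alpha) * power_spec[t]
        noise_track[t] = noise

    return noise_track
```

A hard speech/no-speech decision is used here for brevity; a soft speech-presence probability could replace `speech_present` without changing the overall structure of the loop.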