Statistical estimation using the SNR uncertainty technique is an effective speech enhancement (SE) algorithm. In this method, the gain function plays a crucial role, and it depends on the proper selection of the smoothing and threshold constants. In the literature, the values of these constants have been optimized with a single objective function, the maximization of speech quality, for a specific noise condition. In practice, however, the noise magnitude varies, and one set of optimized parameters cannot always provide consistent performance. This paper addresses the problem in three steps. The first step is a multi-objective optimization that finds the best values of the smoothing and threshold constants at different noise levels, with the objectives of maximizing speech quality and intelligibility and minimizing mean square error. The second step is the classification of the noisy speech into one of four SNR levels (0 dB, 5 dB, 10 dB, and 15 dB) using appropriate audio features. The values obtained in steps one and two are stored, and in the third step, when an unknown noisy speech signal is to be enhanced, the best-chosen values of the smoothing and threshold constants for its SNR level are selected. Finally, the performance of the proposed method is evaluated on two different speech datasets. A comparative performance and statistical analysis against six other standard SE algorithms demonstrates that the proposed approach outperforms them.
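
The third step amounts to a table lookup keyed by the classified SNR level. The sketch below illustrates that idea only; the names `GainConstants`, `STORED_CONSTANTS`, and `select_constants` are hypothetical, and the numeric values are placeholders chosen for illustration, not the optimized constants reported in the paper.

```python
from typing import NamedTuple


class GainConstants(NamedTuple):
    """Smoothing and threshold constants used by the gain function."""
    smoothing: float
    threshold: float


# Placeholder values for illustration only (NOT taken from the paper).
# In the described method, these would come from the multi-objective
# optimization in step one, with one entry per SNR level.
STORED_CONSTANTS = {
    0: GainConstants(smoothing=0.98, threshold=0.10),
    5: GainConstants(smoothing=0.97, threshold=0.15),
    10: GainConstants(smoothing=0.96, threshold=0.20),
    15: GainConstants(smoothing=0.95, threshold=0.25),
}


def select_constants(predicted_snr_db: int) -> GainConstants:
    """Return the stored constants for the SNR level predicted in step two."""
    if predicted_snr_db not in STORED_CONSTANTS:
        raise ValueError(
            f"Unsupported SNR level {predicted_snr_db} dB; "
            f"expected one of {sorted(STORED_CONSTANTS)}"
        )
    return STORED_CONSTANTS[predicted_snr_db]
```

Here `predicted_snr_db` stands in for the output of the SNR classifier, which is assumed to return one of the four levels; for example, `select_constants(10)` returns the entry stored for 10 dB.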