Speech enhancement plays a pivotal role in various applications, such as improving the intelligibility of spoken communication in noisy environments. This research introduces a novel deep-learning-based speech signal enhancement model. The proposed LSTM model uses extracted features to estimate the tuning factor of the Wiener filter, which then yields the de-noised speech signal. The model is structured into two phases: training and testing. During the training phase, Non-negative Matrix Factorization (NMF) is employed to estimate both the noise and signal spectra from the noisy input signal. Subsequently, Empirical Mode Decomposition (EMD) features are extracted from the Wiener filter output, and a de-noised speech signal is obtained. Additionally, Bark frequency information is evaluated. In the testing phase, the LSTM model, trained on the extracted EMD features, supplies the tuning factor to a modified Wiener filter. The combination of LSTM-based temporal modeling with trained features and the adaptive Wiener filter results in significantly improved speech quality and intelligibility. Keywords: Speech Enhancement, Non-negative Matrix Factorization, Empirical Mode Decomposition, Wiener Filter.
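As a rough illustration of how a tuning factor can enter a Wiener filter, the sketch below applies a parametric gain of the form G = S / (S + eta * N) to one spectral frame. The gain formula, the function names `wiener_gain` and `enhance_frame`, and the parameter `eta` are assumptions made for this example; the abstract does not state the exact form of the modified filter. Here `eta` is a fixed number, whereas in the described model it would come from the LSTM, and the signal and noise spectra would come from NMF.

```python
import numpy as np


def wiener_gain(signal_psd, noise_psd, eta=1.0, eps=1e-12):
    """Parametric Wiener gain G = S / (S + eta * N), computed per frequency bin.

    eta is a hypothetical tuning factor: eta = 1 gives the classical
    Wiener gain, and larger values suppress noise more aggressively.
    eps guards against division by zero in empty bins.
    """
    signal_psd = np.asarray(signal_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    return signal_psd / (signal_psd + eta * noise_psd + eps)


def enhance_frame(noisy_spectrum, signal_psd, noise_psd, eta=1.0):
    """Scale one frame of a noisy (possibly complex) spectrum by the gain."""
    gain = wiener_gain(signal_psd, noise_psd, eta)
    return gain * np.asarray(noisy_spectrum)


# Toy frame with two frequency bins (made-up values, not real speech data).
signal_psd = np.array([1.0, 3.0])
noise_psd = np.array([1.0, 1.0])
noisy_spectrum = np.array([2.0 + 0.0j, 4.0 + 0.0j])

classical = wiener_gain(signal_psd, noise_psd, eta=1.0)
aggressive = wiener_gain(signal_psd, noise_psd, eta=3.0)
enhanced = enhance_frame(noisy_spectrum, signal_psd, noise_psd, eta=1.0)
```

Raising `eta` lowers the gain in every bin, trading residual noise for more attenuation of the speech itself, which is why choosing it per frame from learned features is attractive.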