Abstract

This paper presents a novel approach for estimating autoregressive (AR) model parameters using a deep neural network (DNN) for AR-Wiener filtering speech enhancement. Unlike a conventional DNN, which predicts a single kind of target, the DNN used in this paper is trained at the offline stage to predict the AR model parameters of speech and noise simultaneously. We train this network by minimizing the Euclidean distance between the output of the DNN and the AR model parameters of clean speech and noise. At the online stage, acoustic features are first extracted from the noisy speech and used as the input of the DNN. Then, the AR model parameters of speech and noise are estimated by the DNN simultaneously. Finally, the Wiener filter is constructed from the AR model parameters of speech and noise. However, because the AR model parameters model only the spectral shape and not the spectral details, some residual noise remains between the harmonics. To solve this problem, we introduce the speech-presence probability (SPP): in the test stage, the SPP is estimated and used to update the Wiener filter. The experimental results show that our approach achieves better performance than some existing approaches.
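As a rough illustration of the last two online steps, the sketch below builds a Wiener gain from AR model parameters and then adjusts it with an SPP value. It uses the standard AR power spectrum, sigma^2 / |A(e^{jw})|^2, and the standard Wiener gain, P_s / (P_s + P_n). The function names, the excitation variances, the FFT size, and in particular the SPP update rule (a blend between the Wiener gain and a fixed gain floor) are assumptions made for illustration; the abstract does not specify how the SPP updates the filter, and the DNN that would supply the AR parameters is not shown.

```python
import numpy as np


def ar_psd(ar_coeffs, excitation_var, n_fft=512):
    """Power spectrum of an AR process: sigma^2 / |A(e^{jw})|^2.

    ar_coeffs holds a_1..a_p of A(z) = 1 + sum_k a_k z^{-k}.
    """
    a = np.concatenate(([1.0], np.asarray(ar_coeffs, dtype=float)))
    # Evaluate A(e^{jw}) on n_fft // 2 + 1 frequencies in [0, pi].
    a_freq = np.fft.rfft(a, n=n_fft)
    return excitation_var / np.abs(a_freq) ** 2


def wiener_gain(speech_ar, speech_var, noise_ar, noise_var, n_fft=512):
    """Wiener gain per frequency bin from speech and noise AR parameters."""
    p_speech = ar_psd(speech_ar, speech_var, n_fft)
    p_noise = ar_psd(noise_ar, noise_var, n_fft)
    return p_speech / (p_speech + p_noise)


def apply_spp(gain, spp, gain_floor=0.1):
    """Hypothetical SPP update (an assumption, not the paper's rule).

    Bins where speech is unlikely (spp near 0) are pulled toward
    gain_floor; bins where speech is likely keep the Wiener gain.
    """
    return spp * gain + (1.0 - spp) * gain_floor
```

In this sketch `spp` may be a scalar or an array with one value per frequency bin, so residual noise between harmonics could be attenuated by assigning low SPP values to those bins.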
