A Two-stage Speech Enhancement Method Based on Deep Learning Network

Moujia Ye, Hongjie Wan

doi:10.1088/1742-6596/2281/1/012018

Abstract

Conventional single-channel speech enhancement algorithms require only a small amount of computation, but they suffer from “musical noise” and speech distortion. Learning-based enhancement methods achieve relatively good performance, but they require a suitable training data set and a large computational load. This paper proposes a method that combines the two types of algorithm in two stages. In the first stage, Berouti spectral subtraction removes part of the noise from the noisy speech. In the second stage, a deep neural network removes the residual noise and improves the clarity of the speech obtained in the first stage. Experimental results indicate that, when trained on the same data set, the proposed method clearly outperforms the baseline model in PESQ and FWSegSNR.
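The first stage can be illustrated with a minimal sketch of Berouti-style spectral subtraction applied to the power spectrum of a single frame. This is not the authors' implementation: the function names, the base over-subtraction factor `alpha0 = 4`, the spectral floor `beta = 0.01`, and the clipping of the frame SNR to the range [-5, 20] dB are assumptions taken from commonly cited settings for this technique, and the noise power estimate is assumed to be supplied by the caller (for example, averaged over leading non-speech frames).

```python
import numpy as np


def berouti_alpha(snr_db, alpha0=4.0):
    """Over-subtraction factor that decreases linearly with frame SNR.

    Assumed rule: alpha = alpha0 - (3/20) * SNR, with SNR clipped
    to [-5, 20] dB, so alpha lies between 1 and alpha0 + 0.75.
    """
    snr = np.clip(snr_db, -5.0, 20.0)
    return alpha0 - 3.0 * snr / 20.0


def berouti_subtract(noisy_power, noise_power, beta=0.01):
    """Subtract a scaled noise estimate from one frame's power spectrum.

    noisy_power: power spectrum |Y(k)|^2 of the noisy frame
    noise_power: estimated noise power spectrum |N(k)|^2
    beta:        spectral floor parameter (assumed default)
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    eps = 1e-12  # avoids division by zero and log of zero

    # A posteriori SNR of the frame, in dB
    snr_db = 10.0 * np.log10((noisy_power.sum() + eps) / (noise_power.sum() + eps))
    alpha = berouti_alpha(snr_db)

    # Over-subtract the noise, then keep every bin at or above the floor
    cleaned = noisy_power - alpha * noise_power
    floor = beta * noise_power
    return np.maximum(cleaned, floor)
```

In a full pipeline, the enhanced magnitude (the square root of the returned power) would be combined with the noisy phase and inverted frame by frame. The spectral floor is what limits the isolated spectral peaks perceived as "musical noise", at the cost of leaving some residual noise for the second stage to remove.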

Full Text