Abstract

Conventional single-channel speech enhancement algorithms require only a small amount of computation, but suffer from “musical noise” and speech distortion. Learning-based enhancement methods achieve relatively good performance, but require an appropriate training data set and a large computational load. This paper proposes a two-stage method that combines the two types of algorithms. In the first stage, Berouti spectral subtraction removes part of the noise from the noisy speech. In the second stage, a deep neural network removes the residual noise and improves the clarity of the speech obtained in the first stage. The experimental results indicate that, when trained on the same data set, the proposed method clearly outperforms the baseline model in PESQ and FWSegSNR.
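To make the first stage concrete, the following is a minimal sketch of Berouti-style spectral subtraction applied to a single frame of power spectra. It is an illustration under stated assumptions, not the paper's implementation: the function names `berouti_alpha` and `spectral_subtract_frame` are hypothetical, the values `alpha0=4.0` and `beta=0.01` are commonly quoted defaults rather than settings reported in the abstract, and the noise power estimate is assumed to be supplied by the caller (for example, averaged over non-speech frames). The second-stage neural network is not shown.

```python
import math


def berouti_alpha(snr_db, alpha0=4.0):
    """Over-subtraction factor as a piecewise-linear function of frame SNR in dB.

    Assumption: SNR is clamped to [-5, 20] dB and the slope is 3/20 per dB,
    so alpha runs from alpha0 + 0.75 down to alpha0 - 3.
    """
    snr_db = max(-5.0, min(20.0, snr_db))
    return alpha0 - 3.0 * snr_db / 20.0


def spectral_subtract_frame(noisy_power, noise_power, alpha0=4.0, beta=0.01):
    """Over-subtract the noise estimate from one frame, with a spectral floor.

    noisy_power, noise_power: per-bin power values for the same frame.
    beta: spectral floor factor (assumed value) that limits "musical noise".
    """
    eps = 1e-12  # guards against log(0) and division by zero
    snr_db = 10.0 * math.log10((sum(noisy_power) + eps) / (sum(noise_power) + eps))
    alpha = berouti_alpha(snr_db, alpha0)

    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        diff = y - alpha * n   # over-subtracted power
        floor = beta * n       # never drop below a fraction of the noise power
        cleaned.append(diff if diff > floor else floor)
    return cleaned


# Usage: one bin dominated by speech, one bin dominated by noise.
frame = spectral_subtract_frame([100.0, 1.0], [1.0, 1.0])
```

In this sketch the speech-dominated bin is only slightly attenuated, while the noise-dominated bin is clamped to the floor `beta * noise_power`. That residual floor is the kind of leftover noise the second stage is meant to remove.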
