Abstract

This paper proposes a novel deep neural network (DNN) based speech enhancement method using the multi-band excitation (MBE) model. The proposed system has two stages: a training stage and an enhancement stage. In the training stage, two DNNs with different targets are trained: one on the harmonic magnitudes of clean speech and the other on its band difference function. Both DNNs take the log-power spectra (LPS) of noisy speech as input features. In the enhancement stage, the enhanced speech is obtained by MBE speech synthesis from the DNN outputs and an online estimate of the pitch period. With the proposed method, the parameters of the MBE model can be estimated accurately, so that high-quality enhanced speech is synthesized while the noise between the harmonics is effectively eliminated. Experiments show that the proposed method outperforms the reference methods in both speech quality and intelligibility.
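
As an illustration of the input feature named above, the following is a minimal sketch of how log-power spectra might be computed from a noisy waveform. It is not the authors' implementation: the function name `log_power_spectra`, the frame length of 512 samples, the hop of 256 samples, the Hann window, and the `eps` floor are all assumptions chosen for the example, since the abstract does not state the analysis settings.

```python
import numpy as np


def log_power_spectra(signal, frame_len=512, hop=256, eps=1e-10):
    """Return a (num_frames, frame_len // 2 + 1) array of log-power spectra.

    frame_len, hop, and eps are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad short inputs so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))

    num_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(num_frames)
    ])

    # One-sided FFT per frame, then power, then natural log with a small floor.
    spectrum = np.fft.rfft(frames, axis=1)
    power = np.abs(spectrum) ** 2
    return np.log(power + eps)
```

Each row of the returned array corresponds to one analysis frame and could serve as one input vector to a network. In practice such features are often normalized and stacked with neighbouring frames before training, but whether the paper does so is not stated in the abstract.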
