Single-channel speech enhancement using improved progressive deep neural network and masking-based harmonic regeneration

Huang Ping, Wu Yafeng

doi:10.1016/j.specom.2022.10.002




Abstract


Progressive learning (PL) has recently attracted considerable attention in single-channel speech enhancement. However, existing PL-based methods focus only on SNR variations, which may lead to noise overestimation and speech distortion. To address this, we propose a hybrid method for single-channel speech enhancement that combines an improved progressive deep neural network (IPDNN) with a novel masking-based harmonic regeneration (MHR) scheme. First, to trade off noise reduction against distortion of weak-energy speech, we design the IPDNN architecture by guiding each hidden layer to explicitly learn an improved progressive ratio mask (IPRM) as its target, with a specific weak-unvoiced component improvement and SNR gain. Then, to compensate for the first-level enhancement results from the IPDNN and obtain refined results with more harmonic components, we propose MHR, in which the enhanced speech is reconstructed by merging the estimated IPRMs into the conventional harmonic regeneration procedure. Finally, experimental results show that, compared with several reference methods, the proposed method consistently improves perceived speech quality and intelligibility across all noise types and SNR levels tested.
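The abstract does not give the IPRM formula, so the following is only a minimal sketch of what a generic progressive ratio-mask target might look like, not the authors' definition. It assumes that speech and noise are uncorrelated (so noisy power is the sum of the two) and that an intermediate target is the clean speech plus noise attenuated by a chosen SNR gain. The function name `progressive_ratio_mask` and its parameters are hypothetical, and the weak-unvoiced component improvement described in the abstract is not modeled here.

```python
import math

def progressive_ratio_mask(speech_pow, noise_pow, snr_gain_db):
    """Per-bin mask mapping noisy magnitude to an intermediate target.

    Hypothetical illustration: the intermediate target keeps the speech
    power and attenuates the noise power by snr_gain_db decibels.
    Assumes uncorrelated speech and noise (noisy power = speech + noise).
    """
    atten = 10.0 ** (-snr_gain_db / 10.0)  # linear noise-power attenuation
    mask = []
    for s, n in zip(speech_pow, noise_pow):
        noisy = s + n
        target = s + atten * n
        # Silent bins have nothing to pass through, so mask them to zero.
        mask.append(math.sqrt(target / noisy) if noisy > 0.0 else 0.0)
    return mask

# Example: two time-frequency bins, one speech-dominated, one noise-only.
speech = [4.0, 0.0]
noise = [1.0, 1.0]
for gain_db in (0.0, 10.0, 20.0):
    print(gain_db, progressive_ratio_mask(speech, noise, gain_db))
```

With a 0 dB gain the mask is all ones (the target equals the noisy input), and as the gain grows the mask approaches the conventional square-root ideal ratio mask, so a sequence of increasing gains yields progressively cleaner targets that successive layers could be trained toward.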
