Dual branch deep interactive UNet for monaural noisy-reverberant speech enhancement

Zehua Zhang, Shiyun Xu, Xuyi Zhuang, Yukun Qian, Mingjiang Wang

doi:10.1016/j.apacoust.2023.109574


Open Access

https://doi.org/10.1016/j.apacoust.2023.109574


Abstract

Noise and reverberation can severely degrade speech quality and intelligibility, so many deep neural network-based methods for noisy-reverberant speech enhancement have been proposed. The classic approaches are spectral masking and spectral mapping, whose strengths and weaknesses vary with the noise environment and are complementary. This paper proposes a dual branch deep interactive UNet (DBDIUNet) for monaural speech enhancement that combines the advantages of both. DBDIUNet uses a classical encoder-decoder architecture with a shared encoder and two decoders. One decoder outputs the complex ideal ratio mask (cIRM), and the other outputs the enhanced complex spectrum. The two outputs are coupled by coherent averaging to obtain the enhanced speech signal. A novel deep interaction structure exchanges information between the two decoders and yields a significant performance improvement at minimal cost in computation and hyperparameters. Relative to the noisy speech on the Interspeech 2020 deep noise suppression challenge blind test set, DBDIUNet improves WB-PESQ, NB-PESQ, STOI, and SI-SDR by 1.575, 0.955, 7.9%, and 8.67, respectively. In the noisy-reverberant speech enhancement test, DBDIUNet improves WB-PESQ, STOI, SI-SDR, DNSMOS, and SRMR by 0.98, 10.24%, 5.43, 1.51, and 3.43, respectively, exceeding the state-of-the-art model.
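
The fusion of the two decoder outputs can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define "coherent averaging", so the code assumes it means an element-wise arithmetic mean of the two complex spectrum estimates, and it assumes the cIRM is applied to the noisy spectrum by complex multiplication. The function names (`apply_complex_mask`, `average_branches`, `fuse`) and the nested-list spectrum layout are hypothetical.

```python
from typing import List

# A complex spectrogram as nested lists: [frequency bin][time frame].
ComplexSpec = List[List[complex]]


def apply_complex_mask(noisy: ComplexSpec, mask: ComplexSpec) -> ComplexSpec:
    """Masking branch: multiply each noisy bin by its complex mask value."""
    return [
        [m * y for m, y in zip(mask_row, noisy_row)]
        for mask_row, noisy_row in zip(mask, noisy)
    ]


def average_branches(masked: ComplexSpec, mapped: ComplexSpec) -> ComplexSpec:
    """Assumed fusion rule: element-wise mean of the two complex estimates."""
    return [
        [0.5 * (a + b) for a, b in zip(masked_row, mapped_row)]
        for masked_row, mapped_row in zip(masked, mapped)
    ]


def fuse(noisy: ComplexSpec, mask: ComplexSpec, mapped: ComplexSpec) -> ComplexSpec:
    """Combine the masking-branch and mapping-branch estimates."""
    return average_branches(apply_complex_mask(noisy, mask), mapped)


# Toy example: one frequency bin, two time frames.
noisy = [[1 + 1j, 2 + 0j]]
mask = [[0.5 + 0j, 1j]]           # stand-in for the cIRM decoder output
mapped = [[0.5 + 0.5j, 0 + 2j]]   # stand-in for the mapping decoder output
enhanced = fuse(noisy, mask, mapped)
```

Averaging in the complex domain, rather than averaging magnitudes, keeps the phase information from both branches in the combined estimate.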


Journal: Applied Acoustics
Publication Date: Aug 17, 2023
Citations: 5

Similar Papers

Joint waveform and magnitude processing for monaural speech enhancement
Xiaoxiao Xiang ... Xiaojuan Zhang
Applied Acoustics | VOL. 200
26 Oct 2022

On Learning Spectral Masking for Single Channel Speech Enhancement Using Feedforward and Recurrent Neural Networks
Nasir Saleem ... Muhammad Irfan Khattak
IEEE Access | VOL. 8
01 Jan 2020

Effective human detection via multi-model classification and adaptive late fusion
Chao Zhu ... Xu-Cheng Yin
International Journal of Wavelets, Multiresolution and Information Processing | VOL. 16
01 Mar 2018

Speech Enhancement Based on Fusion of Both Magnitude/Phase-Aware Features and Targets
Haitao Lang ... Jie Yang
Electronics | VOL. 9
10 Jul 2020
