Abstract

Monaural speech enhancement aims to remove background noise from noisy speech signals captured by a single microphone. In recent years, several cross-domain monaural speech enhancement methods have been developed to leverage both waveform and harmonic information. However, these methods fall short of fully capturing the dependencies between the time domain and the time-frequency (T-F) domain, and of harnessing the benefits of the target decoupling strategy. This paper proposes a causal encoder-decoder-based Triple-branch Cross-domain Fusion Network (TCF-Net), which processes speech effectively by leveraging both time-domain and T-F-domain features. The proposed approach recovers magnitude and phase information in parallel to alleviate the compensation problem between them. TCF-Net forms a triple-branch network by collaboratively reconstructing the enhanced spectrum with a complex-spectrum branch and a magnitude-spectrum branch, while incorporating time-domain information through a waveform compensation branch. To fully leverage the information from the three domains, Triple-domain Fusion Modules (TFMs) are inserted into each intermediate layer of the model to extract and merge information from the two T-F-domain branches and the time-domain branch. The TFMs generate masks that progressively compensate the magnitude of the two T-F-domain branches and promote information interaction, further restoring the magnitude of the clean speech. Experimental results demonstrate that TCF-Net outperforms state-of-the-art (SOTA) cross-domain methods and target decoupling methods under a causal configuration on all evaluation metrics, validating the importance of the proposed cross-domain information fusion and target decoupling strategies.
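The abstract describes TFMs as modules that merge features from the three branches and emit masks that compensate the magnitudes of the two T-F-domain branches. The paper's actual modules are learned (convolutional) layers whose details are not given here; the following is only a minimal NumPy sketch, under the assumption that fusion can be stood in for by simple averaging and that the compensation mask is a sigmoid applied multiplicatively. All function and variable names are hypothetical.

```python
import numpy as np

def triple_domain_fusion(mag_feat, cplx_feat, wave_feat):
    """Hypothetical sketch of a Triple-domain Fusion Module (TFM).

    Features from the magnitude-spectrum, complex-spectrum, and waveform
    branches (each shaped (frames, freq_bins)) are merged into a shared
    representation, from which a bounded mask is generated to progressively
    compensate the magnitude of the two T-F-domain branches.
    NOTE: averaging + sigmoid stand in for the paper's learned layers.
    """
    fused = (mag_feat + cplx_feat + wave_feat) / 3.0  # naive cross-domain fusion
    mask = 1.0 / (1.0 + np.exp(-fused))               # sigmoid mask in (0, 1)
    mag_out = mag_feat * mask                          # compensate magnitude branch
    cplx_out = cplx_feat * mask                        # compensate complex branch
    return mag_out, cplx_out
```

In the real network such a module would sit in each intermediate encoder-decoder layer, so the masking is applied repeatedly (hence "progressively") rather than once.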
