TFCnet: Time-Frequency Domain Corrector for Speech Separation

Weinan Tong, Helen Meng, Jiaxu Zhu, Jun Chen, Shiyin Kang, Zhiyong Wu

doi:10.1109/icassp49357.2023.10096785

Abstract

Deep learning-based methods have made significant progress in speech separation. Time-domain separation methods in particular have achieved the best performance in recent years. However, time-domain methods can be unstable when transforming waveforms, which makes them prone to amplitude and phase errors. Considering the robustness of time-frequency (T-F) domain methods, we propose an innovative network architecture called the Time-Frequency Domain Corrector Network (TFCNet), which consists of a time-domain separator and a specially designed T-F domain corrector. The corrector module is added after the time-domain separation step to correct the real and imaginary components in the T-F domain. The proposed model achieves state-of-the-art performance, with an SI-SDRi of 22.2 dB on the WSJ0-2mix dataset and an SI-SDRi of 19.4 dB on the Libri-2mix dataset.
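
For readers unfamiliar with the reported metric, the following is a minimal sketch of how SI-SDR and its improvement (SI-SDRi) are commonly computed. It is not the authors' evaluation code: the function names `si_sdr` and `si_sdr_improvement`, the zero-mean normalization, and the small `eps` stabilizer are assumptions made for illustration, following the widely used definition of scale-invariant signal-to-distortion ratio.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length signals.

    Hypothetical helper for illustration; `eps` only guards against
    division by zero and log of zero.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")

    # Remove the mean of each signal (a common convention, assumed here).
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]

    # Project the estimate onto the target to get the scaled reference.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t)
    alpha = dot / (target_energy + eps)
    projection = [alpha * b for b in t]

    # Everything in the estimate not explained by the target is distortion.
    residual = [a - p for a, p in zip(e, projection)]
    projection_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)

    return 10.0 * math.log10((projection_energy + eps) / (residual_energy + eps))


def si_sdr_improvement(estimate, mixture, target):
    """SI-SDRi: SI-SDR gain of the estimate over the unprocessed mixture."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)


# Toy example: the interference is orthogonal to the target.
target = [1.0, -1.0, 1.0, -1.0]
interference = [1.0, 1.0, -1.0, -1.0]
mixture = [s + n for s, n in zip(target, interference)]
estimate = [s + 0.1 * n for s, n in zip(target, interference)]

improvement_db = si_sdr_improvement(estimate, mixture, target)
```

In this toy case the estimate keeps one tenth of the interference amplitude, so its residual energy is 100 times smaller than the mixture's and the improvement is about 20 dB. In a two-speaker setting the score is usually computed per source under the best speaker permutation and then averaged, which this sketch does not cover.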

Full Text