Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

Rongzhi Gu, Dong Yu, Shi-Xiong Zhang, Yuexian Zou

doi:10.1109/taslp.2022.3229261

Abstract

Recently, frequency domain all-neural beamforming methods have achieved remarkable progress in multichannel speech separation. In parallel, the integration of time domain network structures and beamforming has also gained significant attention. This study proposes a novel all-neural beamforming method in the time domain and attempts to unify the all-neural beamforming pipelines for time domain and frequency domain multichannel speech separation. The proposed model consists of two modules: separation and beamforming. Both modules perform temporal-spectral-spatial modeling and are trained end-to-end using a joint loss function. The novelty of this study is twofold. First, a time domain directional feature conditioned on the direction of the target speaker is proposed, which can be jointly optimized within the time domain architecture to enhance target signal estimation. Second, an all-neural beamforming network in the time domain is designed to refine the pre-separated results. This module features parametric time-variant beamforming coefficient estimation, without explicitly following the derivation of optimal filters, which may lead to a performance upper bound. The proposed method is evaluated on simulated reverberant overlapped speech data derived from the AISHELL-1 corpus. Experimental results demonstrate significant performance improvements over frequency domain state-of-the-art methods, ideal magnitude masks, and existing time domain neural beamforming methods.
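The abstract does not spell out how time-variant beamforming coefficients are applied to the microphone signals. As a rough illustration only, a generic time domain filter-and-sum with per-frame FIR coefficients could be sketched as below. The function name, the frame layout (`hop`), and the tap count are assumptions made for this example and are not taken from the paper; in the proposed model the coefficients are estimated by a network, whereas here they are simply passed in as an array.

```python
import numpy as np


def time_variant_filter_and_sum(x, w, hop):
    """Apply frame-wise FIR filters to a multichannel signal and sum channels.

    Hypothetical helper for illustration, not the authors' implementation.

    x   : array of shape (C, T), C microphone channels, T samples
    w   : array of shape (F, C, K), K FIR taps per channel for each of F frames
    hop : number of samples covered by each frame of coefficients

    Computes y[n] = sum_c sum_k w[f(n), c, k] * x[c, n - k], with f(n) = n // hop
    and x[c, m] treated as zero for m < 0.
    """
    C, T = x.shape
    F, C_w, K = w.shape
    if C != C_w:
        raise ValueError("channel count of x and w must match")
    if F * hop < T:
        raise ValueError("not enough coefficient frames to cover the signal")

    # Left-pad with K-1 zeros so that x[c, n-k] is defined for every n and k.
    xp = np.pad(x, ((0, 0), (K - 1, 0)))
    y = np.zeros(T, dtype=np.result_type(x, w))

    for n in range(T):
        f = n // hop
        # xp[:, n:n+K] holds x[c, n-K+1 .. n]; reversing makes index k map to x[c, n-k].
        window = xp[:, n:n + K][:, ::-1]
        y[n] = np.sum(w[f] * window)
    return y


# Usage: two channels, coefficients that switch from channel 0 to channel 1
# halfway through the signal (assumed toy values).
x = np.stack([np.arange(8.0), -np.arange(8.0)])
w = np.zeros((2, 2, 3))
w[0, 0, 0] = 1.0  # first frame: pass channel 0 unchanged
w[1, 1, 0] = 1.0  # second frame: pass channel 1 unchanged
y = time_variant_filter_and_sum(x, w, hop=4)
```

Because the coefficients change from frame to frame, the filter can follow a changing acoustic scene, which is the sense in which such a beamformer is "time-variant". A learned system would replace the hand-set `w` with network outputs.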

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2023
Citations: 14
