Real-Time Binaural Speech Separation with Preserved Spatial Cues

Cong Han, Yi Luo, Nima Mesgarani

doi:10.1109/icassp40776.2020.9053215

Abstract

Deep learning speech separation algorithms have achieved great success in improving the quality and intelligibility of separated speech from mixed audio. Most previous methods focused on generating a single-channel output for each of the target speakers, hence discarding the spatial cues needed for the localization of sound sources in space. However, preserving the spatial information is important in many applications that aim to accurately render the acoustic scene, such as hearing aids and augmented reality (AR). Here, we propose a speech separation algorithm that preserves the interaural cues of separated sound sources and can be implemented with low latency and high fidelity, thereby enabling real-time modification of the acoustic scene. Based on the time-domain audio separation network (TasNet), a single-channel time-domain speech separation system that can be implemented in real time, we propose a multi-input-multi-output (MIMO) end-to-end extension of TasNet that takes binaural mixed audio as input and simultaneously separates target speakers in both channels. Experimental results show that the proposed end-to-end MIMO system significantly improves the separation performance and keeps the perceived location of the modified sources intact in various acoustic scenes.
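
To make "interaural cues" concrete, the sketch below estimates the two standard ones for a left/right signal pair: the interaural level difference (ILD, an energy ratio in dB) and the interaural time difference (ITD, taken here as the lag that maximizes the cross-correlation). It is an illustration only and is not taken from the paper: the function names, the toy source signal, and the 3-sample delay and 0.5 gain used to simulate the right ear are all assumptions chosen for the example.

```python
import math

def ild_db(left, right):
    """Interaural level difference in dB: 10 * log10(E_left / E_right)."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)

def itd_samples(left, right, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    A positive result means the right channel lags the left channel.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Toy source (assumption): a Hann-windowed sine burst followed by silence,
# so that a short delay does not push any energy out of the buffer.
burst = [
    math.sin(2 * math.pi * 5 * i / 64) * 0.5 * (1 - math.cos(2 * math.pi * i / 63))
    for i in range(64)
]
source = burst + [0.0] * 16

# Simulated binaural pair (assumption): the right ear hears the source
# 3 samples later and at half the amplitude.
delay, gain = 3, 0.5
left = source
right = [0.0] * delay + [gain * x for x in source[:-delay]]

print("ITD (samples):", itd_samples(left, right, max_lag=8))
print("ILD (dB):", round(ild_db(left, right), 2))
```

A separation system preserves spatial cues in this sense when the ITD and ILD measured on its two output channels for a speaker stay close to those of that speaker's clean binaural signal. Separating each ear independently gives no such guarantee, which is one motivation for processing both channels jointly.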
