Permutation invariant training of deep models for speaker-independent multi-talker speech separation

Dong Yu, Zheng-Hua Tan, Morten Kolbæk, Jesper Jensen

doi:10.1109/icassp.2017.7952154

Abstract

We propose a novel deep learning model that supports permutation invariant training (PIT) for speaker-independent multi-talker speech separation, commonly known as the cocktail-party problem. Unlike most prior work, which treats speech separation as a multi-class regression problem, and unlike the deep clustering technique, which treats it as a segmentation (or clustering) problem, our model optimizes the separation regression error while ignoring the order of the mixing sources. This strategy solves the long-standing label permutation problem that has prevented progress on deep-learning-based techniques for speech separation. Experiments on the equal-energy mixing setup of a Danish corpus confirm the effectiveness of PIT. We believe improvements built upon PIT can eventually solve the cocktail-party problem and enable real-world adoption of, e.g., automatic meeting transcription and multi-party human-computer interaction, where overlapping speech is common.
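
The idea of optimizing the regression error while ignoring source order can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `pit_mse`, the array layout `(num_sources, num_frames, num_bins)`, and the choice of mean squared error over the whole segment are assumptions made for illustration only. The sketch evaluates the error under every assignment of outputs to reference sources and keeps the smallest one.

```python
import itertools

import numpy as np


def pit_mse(estimates, targets):
    """Return (min_loss, best_perm) over all output-to-target assignments.

    estimates, targets: arrays of shape (num_sources, num_frames, num_bins).
    best_perm[i] is the index of the target matched to estimate i.
    The name, shapes, and MSE criterion are illustrative assumptions.
    """
    estimates = np.asarray(estimates, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if estimates.shape != targets.shape:
        raise ValueError("estimates and targets must have the same shape")

    num_sources = estimates.shape[0]
    best_loss, best_perm = np.inf, None
    for perm in itertools.permutations(range(num_sources)):
        # Mean squared error when estimate i is compared with target perm[i].
        loss = np.mean((estimates - targets[list(perm)]) ** 2)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Usage: perfect estimates delivered in swapped order still score zero,
# because the loss is taken under the best assignment.
rng = np.random.default_rng(0)
targets = rng.standard_normal((2, 5, 4))
estimates = targets[::-1]
loss, perm = pit_mse(estimates, targets)
```

Enumerating all permutations costs S! evaluations for S sources, which is cheap for two or three talkers but grows quickly beyond that.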

Similar Papers

Single-Channel Speech Separation Using Soft-Minimum Permutation Invariant Training
Midia Yousefi ... John H.L Hansen
SSRN Electronic Journal
01 Jan 2021

Single-channel speech separation using soft-minimum permutation invariant training
Midia Yousefi ... John H.L Hansen
Speech Communication | VOL. 151
18 May 2023

Multi-band PIT and Model Integration for Improved Multi-channel Speech Separation
Lianwu Chen ... Meng Yu
01 May 2019

Single-Microphone Speech Enhancement and Separation Using Deep Learning
Morten Kolbæk
31 Aug 2018
