A Joint Approach for Single-Channel Speaker Identification and Speech Separation

Pejman Mowlaee, Zheng-Hua Tan, Søren Holdt Jensen, Mads Græsbøll Christensen, Tomi Kinnunen, Pasi Franti, Rahim Saeidi

doi:10.1109/tasl.2012.2208627

Abstract

In this paper, we present a novel system for joint speaker identification and speech separation. For speaker identification, we propose a single-channel algorithm that provides an estimate of the signal-to-signal ratio (SSR) as a by-product. For speech separation, we propose a sinusoidal model-based algorithm. The speech separation algorithm consists of a double-talk/single-talk detector followed by a minimum mean square error estimator of sinusoidal parameters for finding optimal codevectors from pre-trained speaker codebooks. In evaluating the proposed system, we start from a situation where we have prior knowledge of the codebook indices, speaker identities, and SSR level, and then, by relaxing these assumptions one by one, we demonstrate the efficiency of the proposed fully blind system. In contrast to previous studies, which mostly focus on automatic speech recognition (ASR) accuracy, we also report objective and subjective results. The results show that the proposed system performs as well as the best of the state-of-the-art in terms of perceived quality, while its performance in terms of speaker identification and automatic speech recognition is generally lower. It outperforms the state-of-the-art in terms of intelligibility, showing that the ASR results are not conclusive. The proposed method achieves, on average, 52.3% ASR accuracy, 41.2 points in MUSHRA, and 85.9% speech intelligibility.
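
As a small illustration of the SSR quantity mentioned in the abstract, the sketch below computes the energy ratio between two speaker signals in dB and mixes them at a requested SSR. It assumes the usual energy-ratio definition and that both clean signals are available; it does not reproduce the paper's blind SSR estimate, which works from the mixture alone. The function names `energy`, `ssr_db`, and `mix_at_ssr` are hypothetical and chosen only for this example.

```python
import math


def energy(x):
    """Sum of squared samples of a signal given as a list of floats."""
    return sum(v * v for v in x)


def ssr_db(target, interferer):
    """Signal-to-signal ratio in dB: 10 * log10(E_target / E_interferer).

    Assumes both signals have non-zero energy.
    """
    return 10.0 * math.log10(energy(target) / energy(interferer))


def mix_at_ssr(target, interferer, ssr_target_db):
    """Scale the interferer so that the target-to-interferer energy ratio
    equals ssr_target_db, then add the two signals sample by sample.

    Returns the mixture and the scaled interferer.
    """
    ratio = 10.0 ** (ssr_target_db / 10.0)
    gain = math.sqrt(energy(target) / (energy(interferer) * ratio))
    scaled = [gain * v for v in interferer]
    mixture = [a + b for a, b in zip(target, scaled)]
    return mixture, scaled


# Toy signals (not speech), used only to exercise the functions.
speaker_a = [1.0, -1.0, 1.0, -1.0]
speaker_b = [0.5, 0.5, -0.5, -0.5]
mixture, scaled_b = mix_at_ssr(speaker_a, speaker_b, -6.0)
```

After the call, `ssr_db(speaker_a, scaled_b)` equals the requested -6 dB up to floating-point rounding, so the same helper can be used to build mixtures at several SSR levels.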

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Nov 1, 2012
Citations: 91
License type: other-oa

Similar Papers

Strategies for improving audible quality and speech recognition accuracy of reverberant speech
B.W Gillespie ... A.E Atlas
06 Apr 2003

Evaluation of Spoken Dialogue System that uses Utterance Timing to Interpret User Utterances
Kazunori Komatani ... Hiroshi G. Okuno
01 Jan 2010

Analyzing temporal transition of real user's behaviors in a spoken dialogue system
Kazunori Komatani ... Hiroshi G Okuno
27 Aug 2007

Multichannel Wiener Filter with Early Reflection Raking for Automatic Speech Recognition in Presence of Reverberation
Konrad Kowalczyk
01 Sep 2019
