Segmenting multiple concurrent speakers using microphone arrays

Guillaume Lathoud, Darren C. Moore, Iain A. McCowan

doi:10.21437/eurospeech.2003-47

Abstract

Speaker turn detection is an important task for many speech processing applications. However, accurate segmentation can be hard to achieve if there are multiple concurrent speakers (overlap), as is typically the case in multi-party conversations. In such cases, the location of the speaker, as measured using a microphone array, may provide greater discrimination than traditional spectral features. This was verified in previous work which obtained a global segmentation in terms of single speaker classes, as well as possible overlap combinations. However, such a global strategy suffers from an explosion of the number of overlap classes, as each possible combination of concurrent speakers must be modeled explicitly. In this paper, we propose two alternative schemes that produce an individual segmentation decision for each speaker, implicitly handling all overlapping speaker combinations. The proposed approaches also allow straightforward online implementations. Experiments are presented comparing the segmentation with that obtained using the previous system.
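The class explosion mentioned in the abstract can be illustrated with a small counting sketch. It is not the authors' system. It assumes, purely for illustration, that a global scheme models every non-empty combination of speakers as its own class, while a per-speaker scheme makes one active/inactive decision per speaker. The names `global_classes` and `active_set` are hypothetical.

```python
from itertools import combinations


def global_classes(speakers):
    """Enumerate every non-empty speaker combination (assumed: one class each)."""
    return [
        combo
        for size in range(1, len(speakers) + 1)
        for combo in combinations(speakers, size)
    ]


def active_set(decisions):
    """Derive the overlap combination implied by per-speaker binary decisions.

    `decisions` maps a speaker name to True (active) or False (inactive).
    """
    return tuple(name for name, active in decisions.items() if active)


if __name__ == "__main__":
    for n in (2, 4, 8):
        names = [f"spk{i}" for i in range(1, n + 1)]
        # Global scheme: 2**n - 1 classes; per-speaker scheme: n decisions.
        print(n, len(global_classes(names)), len(names))

    # An overlap of spk1 and spk3 needs no dedicated class here:
    print(active_set({"spk1": True, "spk2": False, "spk3": True}))
```

Under this assumption the global class count grows as 2^N - 1 for N speakers, whereas the number of per-speaker decisions grows linearly, and any overlap combination is recovered from the set of speakers currently marked active.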
