Large-Scale Speaker Diarization for Long Recordings and Small Collections

Marijn Huijbregts, David A Van Leeuwen

doi:10.1109/tasl.2011.2162320

Open Access

https://doi.org/10.1109/tasl.2011.2162320

Abstract

Performing speaker diarization of very long recordings is a problem for most diarization systems that are based on agglomerative clustering with a hidden Markov model (HMM) topology. Performing collection-wide speaker diarization, where each speaker is identified uniquely across the entire collection, is an even more challenging task. In this paper, we propose a method that makes it possible to perform diarization of long recordings efficiently. We have also applied this method successfully to a collection with a total duration of approximately 15 hours. The method first segments long recordings into smaller chunks, on which diarization is performed. Next, a speaker detection system is used to link the speech clusters from each chunk and to assign a unique label to each speaker in the long recording or in the small collection. We show for three different audio collections that it is possible to perform high-quality diarization with this approach. The long meetings from the ICSI corpus are processed 5.5 times faster than originally required, and by uniquely labeling each speaker across the entire collection, it becomes possible to perform speaker-based information retrieval with high accuracy (mean average precision of 0.57).
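
The chunk-then-link idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed chunk length, the toy two-dimensional vectors that stand in for per-cluster speaker models, and the cosine-similarity threshold that stands in for the speaker detection system are all assumptions made for illustration. The per-chunk diarization itself is not shown; its output is assumed to be a set of local cluster labels per chunk.

```python
from itertools import combinations
from math import sqrt


def split_into_chunks(total_seconds, chunk_seconds):
    """Return (start, end) boundaries that cover a recording of total_seconds."""
    return [(start, min(start + chunk_seconds, total_seconds))
            for start in range(0, total_seconds, chunk_seconds)]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors (toy stand-in for a detection score)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def link_clusters(chunk_clusters, threshold=0.9):
    """Map each (chunk_index, local_label) to a recording-wide speaker label.

    chunk_clusters maps (chunk_index, local_label) to a hypothetical vector
    summarising that cluster. Clusters from different chunks are merged when
    their similarity reaches the threshold; merging is transitive (union-find).
    """
    keys = sorted(chunk_clusters)
    parent = {k: k for k in keys}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for a, b in combinations(keys, 2):
        # Only compare across chunks: within a chunk, diarization has
        # already separated the speakers.
        if a[0] != b[0] and cosine(chunk_clusters[a], chunk_clusters[b]) >= threshold:
            parent[find(b)] = find(a)

    labels, result = {}, {}
    for k in keys:
        root = find(k)
        labels.setdefault(root, f"spk{len(labels)}")
        result[k] = labels[root]
    return result


# Hypothetical output of per-chunk diarization for three chunks.
example_clusters = {
    (0, "A"): [1.0, 0.0],
    (0, "B"): [0.0, 1.0],
    (1, "A"): [0.1, 1.0],   # same voice as chunk 0, cluster B
    (1, "B"): [1.0, 0.1],   # same voice as chunk 0, cluster A
    (2, "A"): [0.7, -0.7],  # a speaker not seen in earlier chunks
}
linked = link_clusters(example_clusters)
```

In this toy data, the local labels "A" and "B" swap meaning between chunks 0 and 1, which is why local labels alone cannot identify a speaker across a long recording; the linking step resolves this by assigning one shared label per linked group.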

Journal: IEEE Transactions on Audio, Speech, and Language Processing	Publication Date: Feb 1, 2012
Citations: 52	License type: mit
