Speaker Diarization with LSTM

Quan Wang, Philip Andrew Mansfield, Carlton Downey, Li Wan, Ignacio Lopez Moreno

doi:10.1109/icassp.2018.8462628

Abstract

For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications. However, mirroring the rise of deep learning in various domains, neural network based audio embeddings, also known as d-vectors, have consistently demonstrated superior speaker verification performance. In this paper, we build on the success of d-vector based speaker verification systems to develop a new d-vector based approach to speaker diarization. Specifically, we combine LSTM-based d-vector audio embeddings with recent work in nonparametric clustering to obtain a state-of-the-art speaker diarization system. Our system is evaluated on three standard public datasets, suggesting that d-vector based diarization systems offer significant advantages over traditional i-vector based systems. We achieved a 12.0% diarization error rate on NIST SRE 2000 CALLHOME, while our model is trained with out-of-domain data from voice search logs.
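To make the clustering step concrete, the following is a minimal sketch of how segment-level embeddings could be grouped by speaker without fixing the number of speakers in advance. It is an illustration under stated assumptions, not the system described in the abstract: the function name `naive_online_clustering`, the cosine-similarity threshold of 0.5, and the hand-written 3-dimensional vectors standing in for LSTM d-vectors are all hypothetical, and a simple threshold-based online clustering is used here in place of whichever nonparametric method the authors adopt.

```python
import numpy as np


def naive_online_clustering(embeddings, threshold=0.5):
    """Assign each embedding to the most similar running centroid,
    or open a new cluster when no centroid is similar enough.

    `threshold` is an assumed cosine-similarity cutoff, not a value
    taken from the paper.
    """
    sums = []    # per-cluster sum of L2-normalized embeddings
    labels = []  # cluster index assigned to each segment, in order
    for e in embeddings:
        e = e / np.linalg.norm(e)
        best, best_sim = -1, -1.0
        for idx, s in enumerate(sums):
            # Cosine similarity between the segment and the cluster centroid.
            sim = float(np.dot(e, s / np.linalg.norm(s)))
            if sim > best_sim:
                best, best_sim = idx, sim
        if best >= 0 and best_sim >= threshold:
            sums[best] = sums[best] + e
            labels.append(best)
        else:
            sums.append(e.copy())
            labels.append(len(sums) - 1)
    return labels


# Hypothetical stand-ins for segment-level d-vectors from two speakers.
embeddings = np.array([
    [1.0, 0.1, 0.0],  # speaker A
    [0.9, 0.0, 0.1],  # speaker A
    [0.0, 1.0, 0.1],  # speaker B
    [1.0, 0.0, 0.2],  # speaker A
    [0.1, 0.9, 0.0],  # speaker B
])

labels = naive_online_clustering(embeddings)
print(labels)  # [0, 0, 1, 0, 1]
```

Each label is a speaker index for one segment, so consecutive segments with the same label form one speaker turn. The number of clusters grows only when a segment is dissimilar to every existing centroid, which is why no speaker count needs to be supplied.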
