A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization

Mohammed Senoussaoui, Pierre Dumouchel, Patrick Kenny, Themos Stafylakis

doi:10.1109/taslp.2013.2285474

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2014
Citations: 148

Affiliation: École de Technologie Supérieure, Computer Research Institute of Montréal

#Telephone Speech #Diarization Error Rate

Abstract

Speaker clustering is a crucial step in speaker diarization. In spontaneous telephone conversations, the short duration of speech segments and the absence of prior information on the number of clusters make this problem considerably harder. We propose a simple iterative Mean Shift algorithm based on the cosine distance to perform speaker clustering under these conditions. Two variants of the cosine-distance Mean Shift are compared in an exhaustive practical study. We report state-of-the-art results, as measured by the Diarization Error Rate and the Number of Detected Speakers, on the LDC CallHome telephone corpus.
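
To make the idea of a cosine-distance Mean Shift concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not describe the two variants studied, so this shows only one generic possibility: a flat (uniform) kernel applied to length-normalized vectors. The function name `cosine_mean_shift`, the parameter names, the bandwidth value, and the toy data are all assumptions chosen for illustration. Each point is shifted repeatedly to the renormalized mean of its neighbors within a cosine-distance radius, and the converged modes are then merged, so the number of clusters falls out of the procedure and is not supplied in advance.

```python
import numpy as np


def cosine_mean_shift(X, bandwidth=0.3, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift on unit vectors using cosine distance.

    Illustrative sketch only. X is an (n, d) array of per-segment
    vectors (hypothetical input); bandwidth is an assumed
    cosine-distance radius in (0, 1) that would need tuning.
    Returns (labels, centers).
    """
    if not 0.0 < bandwidth < 1.0:
        raise ValueError("bandwidth must lie in (0, 1)")
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # length-normalize

    modes = np.empty_like(X)
    for i in range(len(X)):
        y = X[i]
        for _ in range(max_iter):
            dist = 1.0 - X @ y                 # cosine distance to all points
            neighbors = X[dist <= bandwidth]   # flat kernel: in or out
            new_y = neighbors.mean(axis=0)
            new_y /= np.linalg.norm(new_y)     # project back to unit sphere
            shift = 1.0 - new_y @ y
            y = new_y
            if shift < tol:
                break
        modes[i] = y

    # Merge converged modes that lie within one bandwidth of each other.
    labels = np.full(len(X), -1, dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if 1.0 - m @ c <= bandwidth:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)


# Toy usage: two synthetic "speakers" as noisy copies of two directions.
rng = np.random.default_rng(0)
spk_a = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((20, 3))
spk_b = np.array([0.0, 1.0, 0.0]) + 0.05 * rng.standard_normal((20, 3))
labels, centers = cosine_mean_shift(np.vstack([spk_a, spk_b]))
print("detected clusters:", len(centers))
```

The bandwidth is the only free parameter in this sketch. It plays the role that a preset cluster count would play in other clustering methods, which is why it suits a setting where the number of speakers is unknown.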
