Unsupervised speaker indexing of discussions using anchor models

Yuya Akita, Tatsuya Kawahara

doi:10.1002/scj.20215

Abstract

We present an unsupervised speaker indexing method, combined with automatic speech recognition (ASR), for speech archives such as discussions. The proposed indexing method is based on anchor models: each utterance is represented by a feature vector of its similarities to the speakers of a large-scale speech database. We apply dimensional normalization and reduction to these vectors to improve their discriminative ability. The vectors are then clustered to obtain initial speaker labels. Using the initial labels, a speaker model is constructed for each cluster, and the speakers are finally indexed with these speaker models. We then perform ASR using the indexing results. On real discussion data, we achieved a speaker indexing accuracy of 97% and a significant improvement in ASR.

© 2005 Wiley Periodicals, Inc. Syst Comp Jpn, 36(9): 25–33, 2005; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20215
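The pipeline described in the abstract (anchor-model vectors, normalization and reduction, clustering, then per-cluster speaker models) can be illustrated with a minimal Python sketch on synthetic data. Everything specific here is an assumption made for illustration and is not taken from the paper: each anchor and speaker model is a single diagonal Gaussian, normalization is a per-dimension z-score, reduction is PCA, clustering is k-means with a known number of speakers, and all function names are hypothetical.

```python
import numpy as np


def avg_loglik(frames, means, variances):
    """Mean per-frame log-likelihood of `frames` under each diagonal Gaussian."""
    diff = frames[:, None, :] - means[None, :, :]             # (T, M, D)
    ll = -0.5 * (np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None])
    return ll.sum(axis=2).mean(axis=0)                        # (M,)


def normalize_and_reduce(vectors, n_components):
    """Assumed normalization (z-score per dimension) and reduction (PCA via SVD)."""
    z = (vectors - vectors.mean(axis=0)) / (vectors.std(axis=0) + 1e-8)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def reindex(utterances, labels, k):
    """Fit one Gaussian speaker model per cluster, then relabel every utterance.

    Assumes every cluster is non-empty, which holds for the demo data below.
    """
    means, variances = [], []
    for j in range(k):
        frames = np.vstack([u for u, lab in zip(utterances, labels) if lab == j])
        means.append(frames.mean(axis=0))
        variances.append(frames.var(axis=0) + 1e-6)
    means, variances = np.array(means), np.array(variances)
    return np.array([np.argmax(avg_loglik(u, means, variances)) for u in utterances])


def demo(seed=0, utts_per_speaker=10, n_frames=100, n_anchors=8):
    rng = np.random.default_rng(seed)
    # Synthetic "speakers" and "anchors" in a 4-dimensional feature space.
    speaker_means = np.array([[3.0, 0, 0, 0], [0, 3.0, 0, 0], [-3.0, -3.0, 0, 0]])
    anchor_means = rng.normal(0.0, 3.0, size=(n_anchors, 4))
    anchor_vars = np.ones((n_anchors, 4))
    true = np.repeat(np.arange(len(speaker_means)), utts_per_speaker)
    utts = [speaker_means[s] + rng.normal(size=(n_frames, 4)) for s in true]

    # 1) Anchor vector: similarity of each utterance to every anchor model.
    vectors = np.array([avg_loglik(u, anchor_means, anchor_vars) for u in utts])
    # 2) Normalize and reduce, 3) cluster for initial labels, 4) re-index.
    reduced = normalize_and_reduce(vectors, n_components=2)
    initial = kmeans(reduced, k=len(speaker_means))
    final = reindex(utts, initial, k=len(speaker_means))
    return true, initial, final


if __name__ == "__main__":
    true, initial, final = demo()
    print("initial labels:", initial)
    print("final labels:  ", final)
```

Cluster labels are arbitrary, so a result should be judged by whether utterances from the same speaker share a label, not by the label values themselves. In a real system the number of speakers is usually unknown and would have to be estimated; this sketch sidesteps that by fixing it in advance.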

Journal: Systems and Computers in Japan
Publication Date: Jun 10, 2005
Citations: 2
