An Investigation on the Accuracy of Truncated DKLT Representation for Speaker Identification With Short Sequences of Speech Frames.

Giorgio Biagetti, Paolo Crippa, Claudio Turchetti, Laura Falaschetti, Simone Orcioni

doi:10.1109/tcyb.2016.2603146

Abstract

Speaker identification plays a crucial role in biometric person identification, as systems based on human speech are increasingly used to recognize people. Mel frequency cepstral coefficients (MFCCs) have been widely adopted for decades in speech processing to capture speech-specific characteristics with reduced dimensionality. However, although their ability to decorrelate the vocal source and the vocal tract filter makes them suitable for speech recognition, they greatly attenuate speaker variability, the specific characteristic that distinguishes different speakers. This paper presents a theoretical framework and an experimental evaluation showing that reducing the feature dimension by applying the discrete Karhunen-Loève transform (DKLT) to the log-spectrum of the speech signal guarantees better performance than conventional MFCC features. In particular, with short sequences of speech frames, typically lasting less than 2 s, the performance of the truncated DKLT representation in identifying five speakers was always better than that of the MFCCs in the experiments we performed. Additionally, the framework was tested on up to 100 TIMIT speakers with sequences of less than 3.5 s, showing very good recognition capabilities.
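
The feature extraction the abstract describes (a truncated DKLT applied to the log-spectrum of speech frames) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, Hann window, number of retained components, and the function names `log_spectrum_frames`, `fit_dklt`, and `project` are all assumptions chosen for illustration, and white noise stands in for real speech.

```python
import numpy as np

def log_spectrum_frames(signal, frame_len=256, hop=128):
    """Split a 1-D signal into Hann-windowed frames and return log-power spectra.

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)  # small floor avoids log(0)

def fit_dklt(X, n_components):
    """Estimate a truncated DKLT basis from the rows of X (one frame per row).

    The basis consists of the eigenvectors of the sample covariance matrix
    that correspond to the n_components largest eigenvalues.
    """
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def project(X, mean, basis):
    """Map log-spectrum frames onto the truncated DKLT basis."""
    return (X - mean) @ basis

# Synthetic stand-in for one second of speech sampled at 16 kHz (assumption).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

X = log_spectrum_frames(signal)
mean, basis = fit_dklt(X, n_components=12)
features = project(X, mean, basis)
```

In a speaker identification setting, the basis would presumably be estimated from training speech and the resulting low-dimensional `features` passed to a classifier; the abstract does not specify these details, so they are left out of the sketch.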

Journal: IEEE Transactions on Cybernetics
Publication Date: Sep 19, 2016
Citations: 57

Similar Papers

Speaker Identification with Short Sequences of Speech Frames
Alessandro Curzi ... Simone Orcioni
01 Jan 2015

Non-linear filtering for feature enhancement of reverberant speech
Amit Kumar Verma ... Hemendra Tomar
01 Nov 2017

Significance of analytic phase of speech signals in speaker verification
Karthika Vijayan ... K Sri Rama Murty
Speech Communication | VOL. 81
26 Feb 2016

Text-Independent Speaker Identification by Combining MFCC and MVA Features
Mohamed Cherif Amara Korba ... Djemili Rafik
01 Nov 2018
