Speaker and speech recognition using wavelets

Μιχάλης Σιαφαρίκας

doi:10.12681/eadd/26434

Abstract

The main goal of this thesis is to exploit wavelets to improve the performance of speaker and speech recognition systems. To this end, four new speech parameterization methods are introduced: (1) The first method adapts the frequency resolution of the wavelet packet transform to the critical bandwidth of auditory filters, incorporating recent advances in their estimation. (2) The second method introduces a generalization of the wavelet packet transform, named the overlapping wavelet packet transform, which emphasizes those frequency sub-bands in which the critical bandwidth changes from a finer to a coarser value. (3) The third method evaluates the contribution to the speaker recognition task of each of the eight non-overlapping frequency sub-bands into which the Nyquist interval is divided, and constructs a wavelet packet transform that adapts its frequency resolution according to the performance of each sub-band. (4) The fourth method introduces a new technique for searching for and selecting the best basis among all wavelet packet transforms available for the speaker recognition task, using the equal error rate (EER) as the criterion. The four speech signal parameterizations were evaluated on the WCL-1 speaker verification system of the Wire Communications Laboratory, University of Patras, using the POLYCOST and NIST speaker recognition corpora, and were shown to outperform previous wavelet-based parameterizations as well as the widely used Mel Frequency Cepstral Coefficients (MFCC). Among the four proposed methods, the second performed best. Furthermore, the most important wavelet properties are analyzed thoroughly, the optimal one is selected for representing the speech signal, and this choice is verified experimentally.
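As a rough illustration of the selection criterion named in the fourth method, the sketch below approximates the EER from two lists of verification scores and then picks the candidate with the lowest value. The function name, the threshold sweep, and the toy scores are assumptions made for illustration only; the abstract does not describe how the thesis computes the EER or searches the space of bases.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine  : scores of target-speaker trials (higher = more likely accepted)
    impostor : scores of impostor trials
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores for two candidate parameterizations (made-up numbers).
candidates = {
    "basis_A": ([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]),
    "basis_B": ([0.9, 0.8, 0.7, 0.6], [0.5, 0.3, 0.2, 0.1]),
}
eers = {name: equal_error_rate(g, i) for name, (g, i) in candidates.items()}
best = min(eers, key=eers.get)  # candidate with the lowest EER
```

With these toy scores the second candidate separates the two classes perfectly, so it is the one selected.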
Finally, the first two parameterization methods were modified and extended for the speech recognition task, where they were shown to outperform traditional, widely used speech parameterization techniques based on the Fourier transform. The main conclusion of this doctoral thesis is that wavelets, and specifically wavelet packet transforms, can be used successfully for speaker and speech recognition.
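To make the idea of a wavelet packet transform with non-uniform frequency resolution more concrete, here is a minimal sketch that splits one frame along a chosen tree and returns log sub-band energies. It assumes the Haar wavelet purely for brevity, and the tree, the frame, and all function names are hypothetical; the abstract does not specify the wavelet, the tree shapes, or the feature computation used in the thesis.

```python
import math


def haar_split(x):
    """One orthonormal Haar analysis step: (approximation, detail), half length each."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(x, splits, path=""):
    """Decompose x along a tree; `splits` is the set of node paths to split further.

    Paths are strings of 'a' (low-pass branch) and 'd' (high-pass branch).
    They follow filter order, which is not the same as frequency order.
    """
    if path not in splits:
        return {path: x}
    approx, detail = haar_split(x)
    leaves = {}
    leaves.update(wavelet_packet_leaves(approx, splits, path + "a"))
    leaves.update(wavelet_packet_leaves(detail, splits, path + "d"))
    return leaves


def subband_log_energies(x, splits):
    """Log energy of each leaf sub-band (small constant avoids log(0))."""
    leaves = wavelet_packet_leaves(x, splits)
    return {p: math.log(sum(c * c for c in cs) + 1e-12) for p, cs in leaves.items()}


# Hypothetical 16-sample frame and a tree that is deeper on the low-pass side.
frame = [math.sin(0.4 * k) + 0.3 * math.sin(2.5 * k) for k in range(16)]
tree = {"", "a", "d", "aa"}  # leaves: aaa, aad, ad, da, dd
features = subband_log_energies(frame, tree)
```

Splitting the low-pass branch more deeply gives narrower sub-bands at low frequencies and wider ones at high frequencies, which is the general kind of resolution trade-off the proposed parameterizations tune.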
