Employment of Spectral Voicing Information for Speech and Speaker Recognition in Noisy Conditions

Peter Jančovič, Münevver Köküer

doi:10.5772/6371

Abstract

In this chapter, we describe our recent advances in the representation and modelling of speech signals for automatic speech and speaker recognition in noisy conditions. The research is motivated by the need for improvements in these areas so that automatic speech and speaker recognition systems can be fully employed in real-world applications, which often operate in noisy conditions. Speech sounds are produced by passing a source-signal through a vocal-tract filter; that is, different speech sounds are produced when a given vocal-tract filter is excited by different source-signals. In spite of this, the speech representation and modelling in current speech and speaker recognition systems typically include only information about the vocal-tract filter, which is obtained by estimating the envelope of short-term spectra. The information about the source-signal used in producing speech may be characterised by the voicing character of a speech frame, or of individual frequency bands, and by the value of the fundamental frequency (F0). This chapter presents our recent research on estimating the voicing information of speech spectra in the presence of noise, and on employing this information in speech modelling and in a missing-feature-based speech/speaker recognition system to improve noise robustness.

The chapter is split into three parts. The first part introduces a novel method for estimating the voicing information of the speech spectrum. Several methods have previously been proposed for this problem. In (Griffin & Lim, 1988), the estimation is based on the closeness of fit between the original spectrum and a synthetic spectrum representing harmonics of the fundamental frequency (F0). A similar measure is also used in (McAulay & Quatieri, 1990) to estimate the maximum frequency considered as voiced.
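The closeness-of-fit idea mentioned above can be illustrated with a minimal sketch. It is not the procedure of (Griffin & Lim, 1988) nor the chapter's own method; it only assumes that F0 is already known, fits windowed sinusoids at the F0 harmonics to one frame by least squares, and reports the normalised spectral error between the original and the synthetic spectrum. The function name `harmonic_fit_error` and all parameters are hypothetical.

```python
import numpy as np

def harmonic_fit_error(frame, f0, fs):
    """Normalised error between a frame's spectrum and a harmonic synthetic one.

    Illustrative only: a low value suggests a voiced (harmonic) frame,
    a value closer to 1 suggests an unvoiced (noise-like) frame.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Use only harmonics strictly below the Nyquist frequency.
    num_harmonics = int(np.ceil(fs / (2.0 * f0))) - 1
    if num_harmonics < 1:
        raise ValueError("f0 must be below the Nyquist frequency")

    window = np.hanning(n)
    t = np.arange(n) / fs
    k = np.arange(1, num_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    # Windowed cosine and sine components at each harmonic of F0.
    basis = np.hstack([np.cos(phase), np.sin(phase)]) * window[:, None]

    target = frame * window
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    synthetic = basis @ coeffs

    orig_spec = np.fft.rfft(target)
    synth_spec = np.fft.rfft(synthetic)
    energy = np.sum(np.abs(orig_spec) ** 2)
    if energy == 0.0:
        return 0.0
    return float(np.sum(np.abs(orig_spec - synth_spec) ** 2) / energy)
```

Restricting the two sums to the FFT bins of one frequency band would give a band-wise measure rather than a frame-wise one. Note that the sketch relies on an accurate F0, which is exactly the requirement that becomes problematic in noisy speech.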
In (McCree & Barnwell, 1995), the voicing information of a frequency region was estimated from the normalised correlation of the time-domain signal around the F0 lag. In (Stylianou, 2001), the voicing information of each spectral peak is estimated by comparing magnitude values at spectral peaks within the F0 frequency range around the considered peak. Estimating voicing information was not the primary aim of the above methods, and as such, no performance evaluation was provided. Moreover, these methods did not consider speech corrupted by noise, and they required an estimate of the F0, which may be difficult to obtain accurately in noisy speech. Here, the presented method for estimation of …
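The normalised-correlation measure attributed above to (McCree & Barnwell, 1995) can be sketched roughly as follows. This is a simplified illustration, not that paper's implementation: the band is isolated with a plain FFT mask instead of a proper band-pass filter, the F0 lag is assumed to be given in samples, and the name `band_voicing` and the `search` parameter are hypothetical.

```python
import numpy as np

def band_voicing(frame, lag, fs, f_lo, f_hi, search=2):
    """Peak normalised correlation of a band-limited frame around a given lag.

    Illustrative only: values near 1 suggest the band is voiced
    (periodic at the F0 lag), values near 0 suggest it is unvoiced.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)

    # Crude band-limiting: zero all FFT bins outside [f_lo, f_hi).
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs >= f_hi)] = 0.0
    band = np.fft.irfft(spectrum, n=n)

    best = -1.0
    # Examine a few lags around the nominal F0 lag and keep the maximum.
    for candidate in range(max(1, lag - search), min(n - 1, lag + search) + 1):
        a = band[:-candidate]
        b = band[candidate:]
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
        corr = np.dot(a, b) / denom if denom > 0.0 else 0.0
        best = max(best, float(corr))
    return best
```

For example, at a sampling rate of 8 kHz an F0 of 125 Hz corresponds to a lag of 64 samples, so `band_voicing(frame, 64, 8000, 0.0, 1000.0)` would score the 0 to 1 kHz region of a frame.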

Publication Date: Nov 1, 2008
Citations: 12	License type: cc-by-nc-sa
