Uncertainty in Signal Estimation and Stochastic Weighted Viterbi Algorithm: A Unified Framework to Address Robustness in Speech Recognition and Speaker Verification

N. Becerra, C. Garreton, F. Huenup, C. Molina

doi:10.5772/4751

Open Access

https://doi.org/10.5772/4751

Publication Date: Jun 1, 2007
Citations: 1
License type: cc-by-nc-sa

Abstract

Robustness to noise and low-bit-rate coding distortion is one of the main problems faced by automatic speech recognition (ASR) and speaker verification (SV) systems in real applications. ASR and SV models are usually trained with speech signals recorded in conditions that differ from the testing environment. This mismatch between training and testing can lead to unacceptable error rates. Noise and low-bit-rate coding distortion are probably the most important sources of this mismatch. Noise is classified as additive if it corresponds to an additive process in the linear domain, and as convolutional if it corresponds to the insertion of a linear transmission channel function. Low-bit-rate coding distortion, in turn, is produced by the coding-decoding schemes employed in cellular systems and VoIP/ToIP. A popular approach to these problems attempts to estimate the original speech signal as it was before the distortion was introduced. However, the original signal cannot be recovered with 100% accuracy, and there will always be some uncertainty in noise canceling. Due to its simplicity, spectral subtraction (SS) (Berouti et al., 1979; Vaseghi & Milner, 1997) has been widely used to reduce the effect of additive noise in speaker recognition (Barger & Sridharan, 1997; Drygajlo & El-Maliki, 1998; Ortega & Gonzalez, 1997), despite the fact that SS loses accuracy at low segmental SNR. Parallel Model Combination (PMC) (Gales & Young, 1993) was applied under noisy conditions in (Rose et al., 1994), where large improvements with additive noise were reported. Nevertheless, PMC requires accurate knowledge of the additive corrupting signal, whose model is estimated from appreciable amounts of noise data, which in turn imposes restrictions on noise stationarity, and of the convolutional distortion, which needs to be estimated a priori (Gales, 1997).
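As an illustration of the spectral subtraction idea mentioned above, the following is a minimal sketch of per-bin power spectral subtraction with an over-subtraction factor and a spectral floor, in the general form usually associated with Berouti et al. (1979). It is not the method proposed in this chapter. The function name, the parameter values `alpha` and `beta`, and the toy power values are assumptions chosen only for illustration; the noise estimate is assumed to have been obtained elsewhere, for example by averaging non-speech frames.

```python
def spectral_subtraction(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Subtract a noise power estimate from one frame of noisy power spectrum.

    noisy_power: per-bin power of the noisy frame, |Y(k)|^2
    noise_power: per-bin noise power estimate, |N(k)|^2 (assumed given)
    alpha: over-subtraction factor (illustrative value)
    beta: spectral floor factor (illustrative value)
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("noisy_power and noise_power must have the same length")
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        diff = y - alpha * n   # over-subtracted estimate of the clean power
        floor = beta * n       # floor that keeps the estimate positive
        cleaned.append(diff if diff > floor else floor)
    return cleaned


# Toy frame: two bins with high local SNR, two bins with low local SNR.
frame = [10.0, 4.0, 1.5, 0.5]
noise = [1.0, 1.0, 1.0, 1.0]
print(spectral_subtraction(frame, noise))  # [8.0, 2.0, 0.01, 0.01]
```

In this toy frame the two low-SNR bins fall to the spectral floor, so whatever speech energy they carried is discarded. This is one way to see why the estimate becomes less reliable at low segmental SNR.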
RASTA filtering (Hermansky et al., 1991) and Cepstral Mean Normalization (CMN) can be very useful to cancel convolutional distortion (Furui, 1982; Reynolds, 1994; van Vuuren, 1996) but, if the speech signal is also corrupted by additive noise, these techniques lose
