Multiple resolution analysis for robust automatic speech recognition

Roberto Gemello, Franco Mana, Dario Albesano, Renato De Mori

doi:10.1016/j.csl.2004.06.001

Abstract

This paper investigates the potential of exploiting the redundancy implicit in multiple resolution analysis for automatic speech recognition systems. The analysis is performed by a binary tree of elements, each consisting of a half-band filter followed by a downsampler that discards odd samples. Filter design and the computation of features from samples are discussed, and recognition performance with different choices is presented. A paradigm is proposed that consists of redundant feature extraction, followed by feature normalization, followed by dimensionality reduction. Feature normalization is performed by denoising algorithms; two are considered and evaluated, namely signal-to-noise-ratio-dependent spectral subtraction and soft thresholding. Dimensionality reduction is performed with principal component analysis. Experiments using telephone corpora and the Aurora3 corpus are reported. They indicate that, on clean speech, the proposed paradigm yields a word error rate marginally better than that obtained with perceptual linear prediction coefficients. On noisy data, however, the proposed analysis paradigm performs significantly better when the same denoising algorithm is applied to all of the analysis methods being compared.
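The analysis tree and the soft-thresholding step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-tap Haar filter pair stands in for the half-band filters, the tree is expanded fully at every level, "odd samples" is read as odd zero-based indices, and the function names (`filter_and_downsample`, `analysis_tree`, `soft_threshold`, `node_log_energies`) are hypothetical. Spectral subtraction and principal component analysis are not shown.

```python
import math

# Haar half-band pair. This is an assumed stand-in: the abstract does not
# say which filters the authors designed.
LOW = (1 / math.sqrt(2), 1 / math.sqrt(2))
HIGH = (1 / math.sqrt(2), -1 / math.sqrt(2))


def filter_and_downsample(signal, taps):
    """Apply a causal 2-tap filter, then keep even-indexed outputs."""
    filtered = [
        taps[0] * signal[n] + taps[1] * (signal[n - 1] if n > 0 else 0.0)
        for n in range(len(signal))
    ]
    # Discard odd samples (zero-based indexing is an assumption).
    return filtered[::2]


def analysis_tree(signal, depth):
    """Return the sub-band signals of every tree node, level by level."""
    levels = [[list(signal)]]
    for _ in range(depth):
        next_level = []
        for band in levels[-1]:
            next_level.append(filter_and_downsample(band, LOW))
            next_level.append(filter_and_downsample(band, HIGH))
        levels.append(next_level)
    return levels


def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def node_log_energies(levels, threshold=0.0):
    """One log-energy feature per node below the root (a redundant set)."""
    feats = []
    for level in levels[1:]:
        for band in level:
            energy = sum(soft_threshold(s, threshold) ** 2 for s in band)
            feats.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return feats


levels = analysis_tree([1.0] * 8, depth=2)
features = node_log_energies(levels, threshold=0.1)
```

Because every level of the tree contributes features, the resulting vector is redundant across resolutions, which is the kind of input a later dimensionality-reduction stage would compress.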

Journal: Computer Speech & Language
Publication Date: Jul 29, 2004
Citations: 15

Similar Papers

Effects of the Dynamic and Energy Based Feature Extraction on Hindi Speech Recognition
Shobha Bhatt ... Anurag Jain
Recent Advances in Computer Science and Communications | VOL. 14
30 Aug 2021

Combined speech enhancement and auditory modelling for robust distributed speech recognition
Ronan Flynn ... Edward Jones
Speech Communication | VOL. 50
20 May 2008

2D Psychoacoustic modeling of equivalent masking for automatic speech recognition
Peng Dai ... Huijun Ding
Signal Processing | VOL. 115
19 Mar 2015

DeepResGRU: Residual gated recurrent neural network-augmented Kalman filtering for speech enhancement and recognition
Nasir Saleem ... Muhammad Shafi
Knowledge-Based Systems | VOL. 238
11 Dec 2021
