A time-synchronous histogram equalization for noise robust speech recognition

Fumiya Takahashi, Tetsuo Kosaka, Masaharu Kato

doi:10.1121/1.4798785

Abstract

The histogram equalization (HEQ) technique is commonly adopted for feature-space normalization in speech recognition systems. In this technique, a transform function is calculated directly from the histograms of the training and test data, and the nonlinear effects of additive noise are compensated. Estimating the transform function accurately requires a certain amount of data. This makes the technique unsuitable for real-time applications, because at least several seconds of evaluation data must be accumulated before the transform function can be calculated, so the system cannot start the recognition process until the end of the utterance. In this research, we aim to develop a new speech recognition method based on the HEQ technique for real-time processing. This method is called "time-synchronous frame-weighted HEQ (ts-FHEQ)." In time-synchronous decoding, the lack of data for estimating the histogram becomes a major problem. To resolve this problem, we introduce a frame weighting approach, in which the degree of transform is controlled according to the number of data frames. Our speech recognition experiments verified that the proposed technique shows good performance and achieves a substantial reduction in calculation time.
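The abstract does not give the transform or the weighting formula, so the following is only a minimal sketch of how a frame-weighted, time-synchronous HEQ could look for a single feature dimension. Everything specific in it is an assumption made for illustration: the function names `heq_value` and `ts_fheq_sketch`, the mid-rank CDF estimate, the weight `w = t / (t + tau)`, and the parameter `tau` are hypothetical and are not taken from the paper. The idea it illustrates is the one described above: each incoming frame is equalized against the histogram of the frames seen so far, and the result is blended with the raw feature so that the transform is applied only weakly while few frames are available.

```python
import bisect
import numpy as np

def heq_value(x, seen_sorted, ref_quantiles):
    """Map x through the empirical CDF of the frames seen so far,
    then through the inverse CDF of the reference (training) data."""
    n = len(seen_sorted)
    # Mid-rank estimate of the test CDF at x (assumed estimator).
    rank = np.searchsorted(seen_sorted, x, side="right")
    cdf = (rank - 0.5) / n
    # Inverse reference CDF, stored as evenly spaced quantiles.
    probs = np.linspace(0.0, 1.0, len(ref_quantiles))
    return float(np.interp(cdf, probs, ref_quantiles))

def ts_fheq_sketch(frames, ref_quantiles, tau=50.0):
    """Process frames one at a time, with no look-ahead.

    The weight w = t / (t + tau) is a hypothetical choice: it grows
    toward 1 as more frames arrive, so early frames stay close to
    the raw feature and later frames approach plain HEQ.
    """
    frames = np.asarray(frames, dtype=float)
    out = np.empty_like(frames)
    seen = []  # sorted history of frames observed so far
    for t, x in enumerate(frames, start=1):
        bisect.insort(seen, x)
        equalized = heq_value(x, seen, ref_quantiles)
        w = t / (t + tau)
        out[t - 1] = w * equalized + (1.0 - w) * x
    return out

# Example: reference quantiles of a uniform distribution on [-1, 1].
ref = np.linspace(-1.0, 1.0, 101)
print(ts_fheq_sketch([3.0, 5.0, 4.0], ref, tau=1.0))
```

Setting `tau=0` in this sketch gives full-strength equalization from the first frame, which makes the effect of the weighting easy to compare against.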

The Journal of the Acoustical Society of America, Vol. 133, 01 May 2013
