Abstract

The smoothed group delay spectrum (SGDS) distance measure is evaluated in speaker-independent recognition experiments. First, the appropriate level of smoothing of the group delay spectrum (GDS) is investigated by adding noise and other distortions to the input speech. Then a comparison with the speaker-dependent case is made. An experiment is reported in which, for low-amplitude parts of speech (e.g., unvoiced speech), the standard (LPC) distance measure is used in the interframe distance calculation instead of the SGDS distance measure. This method prevents the loss of recognition accuracy caused by too strong an emphasis on certain spectral elements, so that a consistently high recognition accuracy can be achieved. Finally, the SGDS distance measure is evaluated with the GDS represented in the spectral domain as a discrete Fourier transform (DFT) of the LPC coefficients. Compared with the SGDS calculated by weighting the LPC cepstrum coefficients, computation time and memory space can be reduced without loss of recognition accuracy. Furthermore, low-bit quantization of the GDS is reported, and a high recognition rate is achieved with only 32 bits per frame.
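
The abstract does not give the formulas involved, so the following is only an illustrative sketch of how a group delay spectrum can be obtained in the spectral domain from LPC coefficients. It relies on the standard identity that, for an all-pole model 1/A(z) with A(z) = sum_k a[k] z^-k, the group delay equals -Re{B/A}, where A and B are the Fourier transforms of a[k] and k*a[k]. The function name `lpc_group_delay`, the frequency grid, and the example coefficients are assumptions made for illustration; the smoothing step, the distance computation, and the quantization described above are not reproduced here.

```python
import cmath
import math


def lpc_group_delay(a, n_points=8):
    """Group delay (in samples) of the all-pole filter 1/A(z).

    a        : LPC polynomial coefficients [1, a1, ..., ap],
               with A(z) = sum_k a[k] * z**-k  (assumed convention).
    n_points : number of frequencies, evenly spaced on [0, pi] (must be >= 2).
    """
    delays = []
    for i in range(n_points):
        w = math.pi * i / (n_points - 1)
        # Fourier transform of a[k] evaluated at frequency w
        A = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
        # Fourier transform of the index-weighted sequence k * a[k]
        B = sum(k * ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
        # Group delay of A is Re{B/A}; the inverse filter 1/A flips the sign
        delays.append(-(B / A).real)
    return delays


# Hypothetical first-order model with a single pole at z = 0.5
gds = lpc_group_delay([1.0, -0.5], n_points=3)
```

Because both transforms are taken directly over the short coefficient sequence, no cepstrum needs to be computed in this sketch, which is consistent with the reduction in computation reported above, although the exact procedure used in the paper may differ.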
