Abstract

In this paper, we describe a novel acoustic model adaptation technique that generates "speaker-independent" HMMs for the target environment. Personal communication devices such as cellular phones are increasingly shifting to IP-based terminals. The encoding-decoding process used to transmit speech over IP networks degrades the quality of the speech data, and this degradation in turn reduces speech recognition performance. Acoustic model adaptation can recover recognition performance, but conventional adaptation methods usually require a large amount of adaptation data. The proposed method uses HMM-based speech synthesis to generate adaptation data from the acoustic model of an HMM-based speech recognizer, and consequently requires no speech data for adaptation. Experimental results on G.723.1 coded speech recognition show that the proposed method improves speech recognition performance, yielding a relative word error rate reduction of approximately 12%.
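The following is a minimal conceptual sketch, not the authors' implementation, of the adaptation pipeline the abstract describes: feature sequences are sampled from the clean acoustic model (standing in for HMM-based speech synthesis), passed through a stand-in for the G.723.1 encode-decode distortion, and then used to re-estimate the model parameters. The toy single-Gaussian-per-state model and the `codec_distort` function are assumptions for illustration; in practice the synthesized waveform would be encoded and decoded by the actual codec before feature extraction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "clean" acoustic model: one diagonal Gaussian per state over a
# 13-dimensional feature space (a stand-in for a full HMM recognizer).
n_states, dim = 5, 13
means = rng.normal(size=(n_states, dim))
variances = np.ones((n_states, dim))

def synthesize_features(n_frames_per_state=40):
    """Sample a feature trajectory from the clean model
    (stand-in for HMM-based speech synthesis)."""
    frames, labels = [], []
    for s in range(n_states):
        x = means[s] + np.sqrt(variances[s]) * rng.normal(size=(n_frames_per_state, dim))
        frames.append(x)
        labels.extend([s] * n_frames_per_state)
    return np.vstack(frames), np.array(labels)

def codec_distort(features):
    """Hypothetical stand-in for the G.723.1 encode-decode channel:
    a fixed feature-domain bias plus mild additional noise."""
    bias = 0.3 * np.ones(features.shape[1])
    return features + bias + 0.1 * rng.normal(size=features.shape)

# Generate adaptation data from the model itself (no real speech needed)
# and distort it as the codec would.
feats, states = synthesize_features()
coded_feats = codec_distort(feats)

# Re-estimate (adapt) each state's Gaussian on the coded features.
adapted_means = np.array([coded_feats[states == s].mean(axis=0) for s in range(n_states)])
adapted_vars = np.array([coded_feats[states == s].var(axis=0) for s in range(n_states)])
```

The design choice illustrated here is that the adaptation data originates entirely from the recognizer's own acoustic model, so only the codec's transformation of the signal needs to be applied before re-estimation.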
