Abstract

Handling background noise and echo (reverberation) is essential if an automated system such as a robot is to recognize distant speech in a real environment. Effective schemes proposed for this problem include model adaptation (such as HMM decomposition and composition), microphone array (beamformer) signal processing, and spectral subtraction. Model adaptation is particularly effective for speech recognition in noisy environments, and its recognition performance improves as the signal-to-noise ratio (SNR) increases. This paper therefore attempts to improve recognition performance in low-SNR environments by using a microphone array to capture speech at a higher SNR before applying HMM decomposition and composition. In speech recognition experiments conducted in a noisy environment in an acoustic laboratory, with an SNR of 0 dB at a single microphone, the proposed method improved the recognition rate by about 25% compared with using either microphone array signal processing or HMM decomposition and composition alone. The proposed method also achieved recognition performance comparable to that obtained by applying cepstrum mean normalization and spectral subtraction, with an optimal coefficient, to the speech after microphone array processing.

© 2002 Wiley Periodicals, Inc. Electron Comm Jpn Pt 2, 85(9): 13–22, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjb.10068
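To illustrate why a microphone array can deliver speech at a higher SNR to a later recognition stage, the following is a minimal sketch of a delay-and-sum beamformer. It is not the paper's implementation: the abstract does not name the beamformer type, so delay-and-sum is an assumption, as are the sine-tone stand-in for speech, the integer sample delays, the independent Gaussian noise at each microphone, and all function names. Under those assumptions, averaging M aligned channels should raise the SNR by roughly 10·log10(M) dB.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def snr_db(clean, noisy):
    """SNR in dB of `noisy` measured against the clean reference."""
    sig_power = sum(s * s for s in clean)
    noise_power = sum((x - s) ** 2 for s, x in zip(clean, noisy))
    return 10.0 * math.log10(sig_power / noise_power)


def simulate(num_mics=8, num_samples=4000, seed=0):
    """Return (single-microphone SNR, beamformed SNR) in dB."""
    rng = random.Random(seed)
    # Hypothetical target signal: a sine tone standing in for speech.
    clean = [math.sin(2 * math.pi * 0.01 * n) for n in range(num_samples)]
    # The tone has power 0.5, so this noise level gives about 0 dB per mic.
    noise_std = math.sqrt(0.5)
    # Assumed geometry: microphone m hears the source m samples late.
    delays = list(range(num_mics))
    max_delay = max(delays)
    channels = []
    for d in delays:
        delayed = [0.0] * d + clean + [0.0] * (max_delay - d)
        channels.append([x + rng.gauss(0.0, noise_std) for x in delayed])
    single = snr_db(clean, channels[0])
    beamformed = snr_db(clean, delay_and_sum(channels, delays))
    return single, beamformed


if __name__ == "__main__":
    single, beamformed = simulate()
    print(f"single microphone: {single:.1f} dB")
    print(f"delay-and-sum:     {beamformed:.1f} dB")
```

The gain in this sketch depends on the noise being uncorrelated across microphones; in a reverberant room with diffuse or directional noise the improvement would generally be smaller, which is one reason to follow the array with a model adaptation step.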