Spoken digits recognition using DP matching combined with a subspace decomposition method

Ken Kusakari, Takahiro Murakami, Yoshihisa Ishida, Kurihara Kiyoshi

doi:10.1121/1.4777213

Abstract

In this paper, we propose a method for spoken digit recognition that combines DP matching with a subspace decomposition, based on principal component analysis, that linearly separates phonetic information from speaker information [M. Nishida and Y. Ariki, IEICE Trans. Japan J85-D-II, No. 4 (2002)]. This method allows more robust recognition of less standard speech patterns. Speech recognition based on the LPC spectral envelope cannot avoid recognition errors caused by differences between individual speakers, the dynamic variation of features, and so on. By using the subspace method, the proposed method eliminates these problems and gives good recognition results for less standard speech patterns. We use DP matching for recognition because it allows more efficient pattern matching by normalizing the length of vowels. Simulation results show that the proposed method, which uses orthonormal projection onto a phonetic subspace containing less speaker information, is superior to the conventional method using LPC spectra and DP matching.
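The two steps named in the abstract, projection onto a phonetic subspace followed by DP matching, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an orthonormal basis for the phonetic subspace is already available (the PCA-based decomposition itself is not reproduced here), and it uses a textbook symmetric DP recursion with Euclidean local distance. The names `project_onto`, `dp_matching`, `recognise`, and the toy three-dimensional features are hypothetical.

```python
import math

def project_onto(frame, phonetic_basis):
    """Coordinates of one feature vector in the phonetic subspace.

    Assumes `phonetic_basis` is a list of orthonormal vectors (an assumption
    of this sketch; the abstract obtains the subspace via PCA).
    """
    return [sum(f * b for f, b in zip(frame, basis_vec))
            for basis_vec in phonetic_basis]

def dp_matching(ref, test):
    """Length-normalised DP matching (dynamic time warping) distance."""
    n, m = len(ref), len(test)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(ref[i - 1], test[j - 1])  # Euclidean distance
            # Allow a stretch of either sequence or a diagonal step.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def recognise(templates, utterance, phonetic_basis):
    """Return the label of the template closest to the utterance."""
    projected = [project_onto(f, phonetic_basis) for f in utterance]
    scores = {
        label: dp_matching(
            [project_onto(f, phonetic_basis) for f in template], projected)
        for label, template in templates.items()
    }
    return min(scores, key=scores.get)

# Toy data: dimensions 0 and 1 carry phonetic content, dimension 2 is a
# speaker-dependent offset that the projection discards.
phonetic_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
templates = {
    "one": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    "two": [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 2.0, 0.0]],
}
# Same trajectory as "one", with a lengthened middle frame and a speaker offset.
utterance = [[0.0, 0.0, 5.0], [1.0, 0.0, 5.0], [1.0, 0.0, 5.0], [2.0, 0.0, 5.0]]
result = recognise(templates, utterance, phonetic_basis)
```

In this toy case the projection removes the speaker offset and the DP alignment absorbs the repeated frame, so the utterance matches the "one" template with zero distance. Real LPC spectral features and a learned subspace would of course not separate this cleanly.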
