Abstract
Speech recognition applications are known to require a significant amount of resources (training data, memory, computing power). However, the context targeted in this work, a speech recognition system embedded in a mobile phone, allows only a few KB of memory, a few MIPS, and usually a small amount of training data. To fit these resource constraints, this paper proposes an approach based on a semi-continuous HMM system that uses GMM-based, state-independent acoustic modeling. A transformation is computed and applied to the global GMM to obtain each of the HMM state-dependent probability density functions. With this strategy, only the parameters of the transformation function need to be stored for each state, and the computing power needed for the likelihood computation is reduced. The proposed approach is evaluated on two tasks: a digit recognition task using the French corpus BDSON, on which it achieves a Digit Error Rate of 2.5%, and a voice command task using the French corpus VODIS, on which the Command Error Rate is around 4.1%.
Index Terms: embedded speech recognition, acoustic modeling.
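As a rough illustration of why a shared, state-independent GMM reduces the cost of likelihood computation, the sketch below assumes the simplest possible transformation: each state only re-weights the components of the global GMM, as in a classical semi-continuous HMM. The abstract does not specify the transformation actually used, so this is an assumption, and every name here (`state_log_likelihoods`, `state_weights`, the toy parameters) is hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    if not values:
        return -math.inf
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def state_log_likelihoods(x, means, variances, state_weights):
    """Score one frame against every state using one shared (global) GMM.

    Assumption for illustration: the per-state "transformation" is just a
    weight vector over the shared Gaussians, so each state stores only
    len(means) numbers instead of its own means and variances.
    """
    # Shared step: each Gaussian of the global GMM is evaluated once per frame.
    component_scores = [
        log_gaussian_diag(x, m, v) for m, v in zip(means, variances)
    ]
    # Per-state step: a weighted sum over the cached component scores.
    return {
        state: logsumexp(
            [math.log(w) + c for w, c in zip(weights, component_scores) if w > 0.0]
        )
        for state, weights in state_weights.items()
    }


# Toy example: two shared 2-D Gaussians, two states that differ only in weights.
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
state_weights = {"s1": [0.9, 0.1], "s2": [0.1, 0.9]}
scores = state_log_likelihoods([0.1, -0.2], means, variances, state_weights)
```

In this sketch the expensive Gaussian evaluations are shared by all states, and the per-state cost is a single weighted sum, which is the kind of saving in both storage and computation that the abstract describes.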