This paper proposes a new approach to the design of isolated word speech recognition (IWSR) systems. The approach uses a dynamic matching strategy that depends on the nature of the input speech segment, which we call signal-dependent matching. The computational complexity of the proposed algorithm is significantly reduced by matching in two stages. In the first stage, the warping path between the test utterance and a reference utterance is determined. In the second stage, the distance between the utterances is computed along that path. A two-stage approach performs slightly worse than a single-stage approach, but this can be tolerated in view of the significant computational advantage, and the degradation is more than compensated by the signal-dependent matching strategy used in the second stage. To measure the improvement in recognition performance, a new performance index is defined that reflects the characteristics of the distance matrix for a given vocabulary rather than those of the confusion matrix. The signal-dependent matching algorithm performs significantly better than the standard dynamic time warping matching algorithm for both confusable and nonconfusable vocabularies. We also develop a signal-dependent matching algorithm that takes into account some distortions in the input speech. As an example, we present the same test utterance to the algorithm twice, once undistorted and once after distortion. Our results so far indicate an improvement in automatic isolated word speech recognition systems when signal-dependent parameter measurement and signal-dependent matching are used.
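The two-stage matching described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the second-stage matching depends on the signal, so the energy-based frame weight below (`energy_weight`) is purely a hypothetical placeholder, as are the function names, the Euclidean frame distance, and the unconstrained three-way local step. Stage 1 runs ordinary dynamic time warping only to recover the alignment path; stage 2 recomputes a weighted distance along that fixed path.

```python
from math import inf


def euclidean(a, b):
    """Euclidean distance between two feature frames (assumed frame distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def warping_path(test, ref, dist=euclidean):
    """Stage 1: standard DTW, returning the optimal path as (i, j) frame pairs."""
    n, m = len(test), len(ref)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(test[i - 1], ref[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def energy_weight(frame):
    """Hypothetical signal-dependent weight: emphasizes high-energy test frames."""
    return 1.0 + sum(x * x for x in frame)


def path_distance(test, ref, path, dist=euclidean, weight=energy_weight):
    """Stage 2: weighted average frame distance along a fixed warping path."""
    total = sum(weight(test[i]) * dist(test[i], ref[j]) for i, j in path)
    norm = sum(weight(test[i]) for i, _ in path)
    return total / norm


def two_stage_distance(test, ref):
    """Run both stages and return the stage-2 distance."""
    return path_distance(test, ref, warping_path(test, ref))
```

In a recognizer built this way, `two_stage_distance` would be evaluated against every reference template and the word with the smallest distance selected. Because the path is fixed after stage 1, the stage-2 weighting can be changed without repeating the dynamic programming search.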