The Nonlinear Time Alignment Model for Speech Recognition System

D D Doye, T R Sontakke, Smita Nagtode

doi:10.1080/03772063.2001.11416239

Abstract

We present a new nonlinear time alignment model that is much faster than the widely accepted dynamic time warping (DTW) algorithms. This work began with the aim of finding a suitable time alignment algorithm and feature set for a Marathi word speech recognition system (Marathi is the language spoken in the state of Maharashtra, India). The proposed algorithm shows comparable or better recognition efficiency than widely accepted algorithms and is robust to endpoint variations. Two vocabularies are used: (1) 46 isolated, monosyllabic, confusing Marathi alphabets and (2) 46 non-confusing names of persons. The features used are LPC, LFCC and MFCC. For the confusing vocabulary, the proposed algorithm performed best, with a maximum recognition efficiency of 89.13%; Itakura's DTW algorithm was second, with a maximum recognition efficiency of 86.96%. LFCC with Itakura's DTW algorithm performs poorly, with a maximum recognition efficiency of 13.40%, whereas LFCC with the proposed algorithm gives comparable results for the non-confusing vocabulary.
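For orientation, the kind of DTW baseline the abstract compares against can be sketched as below. This is the textbook unconstrained form with a Euclidean local distance; it is not the proposed nonlinear time alignment model, and it omits the slope constraints of Itakura's variant. The names `dtw_distance` and `recognize`, and the template dictionary, are illustrative assumptions rather than anything taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors.

    Textbook unconstrained DTW (assumption: Euclidean local distance,
    steps of (1,0), (0,1) and (1,1)); not the paper's proposed model.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template closest to the utterance.

    `templates` is a hypothetical dict mapping word label -> feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

Each comparison fills an n-by-m table, so matching one utterance against every template costs time proportional to the product of the two sequence lengths per template. That cost is what a faster alignment method would need to reduce.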
