Abstract

Speaker-independent word recognition is performed on a small, acoustically distinct vocabulary with minimal hardware requirements. After a simple preconditioning filter, the zero-crossing intervals of the input speech are measured and sorted by duration, providing a rough measure of the frequency distribution within each input frame. The distribution of zero-crossing intervals is transformed into a binary feature vector, which is compared with each reference template using a modified Hamming distance measure. A dynamic time warping algorithm is used to permit recognition across different speaking rates and to economize on reference template storage. Each reference vector in a template is paired with a mask vector, which is used to ignore insignificant (or speaker-dependent) features of the words detected.
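The pipeline described above can be illustrated with a minimal sketch. The abstract does not give the bin edges, the thresholding rule, the exact form of the modified Hamming distance, or the warping constraints, so everything below is an assumption for illustration only: the function names, the `bin_edges` and `threshold` parameters, the use of a bitwise mask as the distance modification, and the textbook three-way DTW recurrence are all hypothetical choices, not the authors' method.

```python
from bisect import bisect_right

def zero_crossing_intervals(frame):
    """Gaps (in samples) between successive sign changes in one frame."""
    crossings = [i for i in range(1, len(frame))
                 if (frame[i] < 0) != (frame[i - 1] < 0)]
    return [b - a for a, b in zip(crossings, crossings[1:])]

def binary_features(frame, bin_edges, threshold=1):
    """Sort intervals into duration bins, then reduce each bin to one bit.

    Assumed rule: a bit is set when its bin holds at least `threshold`
    intervals. Bins are half-open [lo, hi); out-of-range intervals are dropped.
    """
    counts = [0] * (len(bin_edges) - 1)
    for interval in zero_crossing_intervals(frame):
        k = bisect_right(bin_edges, interval) - 1
        if 0 <= k < len(counts):
            counts[k] += 1
    return [1 if c >= threshold else 0 for c in counts]

def masked_hamming(feature, reference, mask):
    """Count differing bits, ignoring positions where the mask bit is 0."""
    return sum(m & (f ^ r) for f, r, m in zip(feature, reference, mask))

def dtw_distance(features, template):
    """Accumulated masked distance along the best warping path.

    `features` is a list of binary vectors (one per input frame);
    `template` is a list of (reference, mask) pairs.
    """
    inf = float("inf")
    n, m = len(features), len(template)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            ref, mask = template[j - 1]
            d = masked_hamming(features[i - 1], ref, mask)
            cost[i][j] = d + min(cost[i - 1][j],      # input frame repeated
                                 cost[i][j - 1],      # template frame skipped over
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Under these assumptions, recognition would amount to computing `dtw_distance` against every stored template and choosing the word with the smallest result. Because the warping path can hold an input frame against one template frame for several steps, a slowly spoken word can still match a short template, which is one way the storage saving mentioned above could arise.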
