Continuous speech recognition method

Stephen L Moshier

doi:10.1121/1.391522

Abstract

A speech recognition method for detecting and recognizing one or more keywords in a continuous audio signal is disclosed. Each keyword is represented by a keyword template representing one or more target patterns, and each target pattern comprises statistics of each of at least one spectrum selected from plural short-term spectra generated according to a predetermined system for processing the incoming audio. The spectra are processed by a frequency equalization and normalizing method to enhance the separation between the spectral pattern classes during later analysis. The processed audio spectra are grouped into spectral patterns, transformed to reduce the dimensionality of the patterns, and compared by means of likelihood statistics with the target patterns of the keyword templates. A concatenation technique employing a loosely set detection threshold makes it very unlikely that a correct pattern will be rejected.
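
The processing chain described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed method: the abstract does not specify the equalization, the normalization, the dimensionality-reducing transform, or the form of the likelihood statistic. The sketch assumes log-spectral frames, equalization by subtracting each channel's long-term mean, normalization by removing each frame's overall level, a linear projection onto a given basis, and a diagonal Gaussian log-likelihood. All function names, parameters, and the threshold are hypothetical.

```python
import math


def equalize(frames):
    """Subtract each channel's long-term mean (assumed form of equalization)."""
    n = len(frames)
    channels = len(frames[0])
    means = [sum(f[c] for f in frames) / n for c in range(channels)]
    return [[f[c] - means[c] for c in range(channels)] for f in frames]


def normalize(frame):
    """Remove the overall level of one frame (assumed form of normalization)."""
    level = sum(frame) / len(frame)
    return [x - level for x in frame]


def project(pattern, basis):
    """Reduce dimensionality by projecting onto the rows of a given basis."""
    return [sum(b * x for b, x in zip(row, pattern)) for row in basis]


def log_likelihood(x, mean, var):
    """Diagonal Gaussian log-likelihood of x under target-pattern statistics."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - mi) ** 2 / v)
        for xi, mi, v in zip(x, mean, var)
    )


def detect(pattern, target_mean, target_var, threshold):
    """Accept the pattern when its score reaches a (loosely set) threshold."""
    score = log_likelihood(pattern, target_mean, target_var)
    return score >= threshold, score
```

Setting the threshold low accepts more candidates at this stage, which mirrors the abstract's point that a loose threshold makes rejecting a correct pattern unlikely; false candidates would then have to be screened by a later step such as the concatenation of target patterns.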

Full Text