Efficient recognition of continuously-spoken numbers

D O'Shaughnessy, M Gabrea

doi:10.1109/ccece.2001.933735

Abstract

Automatic recognition of continuously-spoken numbers (e.g., telephone or credit card digit sequences) is possible with excellent accuracy, even in applications using telephone lines and serving a large population. However, even such simple recognition tasks suffer decreased performance in adverse conditions, e.g., significant background noise or fading on portable telephone channels. If we further impose significant limitations on the computing resources available for the recognition task, then robust, efficient speech recognition remains a significant challenge, even for a vocabulary as simple as the digits. Since connected-digit recognition over telephone lines has very practical applications, the amount of computer resources needed for a given level of recognition accuracy is investigated. Rather than use a traditional hidden Markov model approach with cepstral analysis, which is computationally intensive and does not always work well under adverse acoustic conditions, a simpler spectral analysis is used, combined with a segmental approach. The restricted nature of the digit vocabulary allows this simpler approach. High recognition accuracy can be maintained despite a large decrease in both memory and computation.

Full Text