Abstract
The popular accuracy and substitution-error measures of performance for speech recognizers fail to account for the distribution of errors among vocabulary items, the potentially different costs of errors, and the difficulty of different recognition tasks. A new performance measure, the Relative Information Loss (RIL), accounts for the distribution of errors and, when used in conjunction with a rate-distortion model, can reflect the costs of individual errors in voice entry systems. When error rate is used as the performance measure, worst-case performance varies with vocabulary size, whereas with RIL the scale from best-case to worst-case performance is independent of vocabulary size. The confusion matrices from an extension of the Doddington-Schalk [1] tests of commercial speech recognizers were used to determine the RIL of each of the 10 recognizers tested. Cost-performance analysis, carried out with a program for rate-distortion analysis, determined, for any user-defined limits on the expected costs of errors, which of the recognizers would perform adequately under the tested task conditions.
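As an illustration of how an information-loss measure can separate recognizers that share the same error rate, the sketch below computes RIL from a confusion matrix. It assumes the common information-theoretic definition RIL = H(X|Y) / H(X), where X is the spoken word and Y is the recognizer output; the abstract itself does not state the formula, so this definition, the function names, and the two example matrices are all assumptions made for illustration and are not data from the tests.

```python
import math


def error_rate(confusion):
    """Fraction of utterances not on the diagonal (rows: spoken, cols: recognized)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(min(len(confusion), len(confusion[0]))))
    return 1.0 - correct / total


def relative_information_loss(confusion):
    """Assumed definition: RIL = H(X|Y) / H(X), estimated from confusion counts.

    0.0 means the output determines the input exactly; 1.0 means the output
    carries no information about the input, whatever the vocabulary size.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix has no counts")
    n_in, n_out = len(confusion), len(confusion[0])

    # Marginal distributions of spoken words (rows) and recognized words (columns).
    p_x = [sum(row) / total for row in confusion]
    p_y = [sum(confusion[i][j] for i in range(n_in)) / total for j in range(n_out)]

    # Input entropy H(X) in bits.
    h_x = -sum(p * math.log2(p) for p in p_x if p > 0)
    if h_x == 0:
        raise ValueError("input entropy is zero; RIL is undefined")

    # Equivocation H(X|Y) = -sum p(x, y) * log2(p(x, y) / p(y)).
    h_x_given_y = 0.0
    for i in range(n_in):
        for j in range(n_out):
            if confusion[i][j] > 0:
                p_xy = confusion[i][j] / total
                h_x_given_y -= p_xy * math.log2(p_xy / p_y[j])
    return h_x_given_y / h_x


# Hypothetical 4-word vocabulary, 100 utterances per word, 30% errors in both cases.
# Errors spread evenly over all wrong words:
scattered = [
    [70, 10, 10, 10],
    [10, 70, 10, 10],
    [10, 10, 70, 10],
    [10, 10, 10, 70],
]
# Errors concentrated on a single confusable word:
concentrated = [
    [70, 30, 0, 0],
    [0, 70, 30, 0],
    [0, 0, 70, 30],
    [30, 0, 0, 70],
]

if __name__ == "__main__":
    for name, matrix in (("scattered", scattered), ("concentrated", concentrated)):
        print(name, error_rate(matrix), relative_information_loss(matrix))
```

Under this assumed definition, both hypothetical matrices have the same error rate, yet the one with concentrated errors loses less information, because a systematic confusion leaves the output more informative about what was spoken. The normalization by H(X) is what keeps the scale between 0 and 1 regardless of vocabulary size.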