Abstract

This paper first presents detailed results from several field evaluations of the CNET speaker-independent speech recognition system, deployed in two voice-activated servers accessible to the general French public over the telephone. The analysis of roughly 11,000 user tokens indicates that rejecting incorrect input is a major problem and that the gap between the recognition rates observed in real use conditions and those in the most realistic laboratory tests remains very large. The second part of the paper describes current improvements to the system: better rejection procedures; improved recognition performance resulting from both the introduction of field data into the training data and an increase in the number of parameters; and automatic adjustment of the HMM topology, which makes it possible either to reduce overall model complexity or to improve recognition performance. Tested on long-distance telephone databases (450 to 750 speakers), the current version of the CNET recognition system yields a laboratory error rate of 0.7% on the 10 French digits and 0.95% on a 36-word vocabulary.

