Abstract

The input speech is segmented using plural grammar networks, including a network that includes a filler model designed to represent noise or extraneous speech. Recognition processing results in plural lists of candidates, each list containing the N-best candidates generated. The lists are then separately aligned with the dictionary of valid names to generate two lists of valid names. The final recognition pass combines these two lists of names into a dynamic grammar and this dynamic grammar may be used to find the best candidate name using Viterbi recognition. A telephone call routing application based on the recognition system selects the best candidate name corresponding to the name spelled by the user, whether the user pronounces the name prior to spelling, or not.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.