Abstract

An automatic information server operated over the telephone network is presented. It requires the caller to input an old German postal area code for mail addresses and responds with the new code. The server combines speaker-independent recognition of connected words with a synthesis-by-concept scheme in which town names are spoken by rule. Recognition is based on an HMM recognizer trained to be speaker independent. Its feature extraction is insensitive both to the differing frequency responses of telephones and transmission lines and to slowly varying background noise. Speech synthesis uses the PSOLA technique with concatenation of diphones. To generate highly natural-sounding speech, most utterances are stored as natural speech, and only town names are synthesized. To avoid the impression of two different voices, the stored utterances and the diphones were spoken by the same speaker.
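The abstract does not name the method that makes the feature extraction insensitive to the channel's frequency response. As an illustration only, the sketch below uses per-band mean subtraction in the log-spectral domain, one common way to get that property; it is an assumption, not necessarily the authors' technique. A fixed telephone or line response scales each frequency band by a constant gain, which becomes a constant offset after taking logarithms, so subtracting each band's mean over time removes it. The function names, the feature values, and the gains are all hypothetical.

```python
import math


def subtract_mean(frames):
    """Remove the per-band mean over time from log-spectral frames."""
    n_frames = len(frames)
    n_bands = len(frames[0])
    means = [sum(f[b] for f in frames) / n_frames for b in range(n_bands)]
    return [[f[b] - means[b] for b in range(n_bands)] for f in frames]


def apply_channel(frames, gains):
    """Simulate a fixed channel: a constant gain per band adds log(gain)
    to every frame in the log domain."""
    return [[f[b] + math.log(gains[b]) for b in range(len(gains))]
            for f in frames]


# Hypothetical log band energies: 4 frames x 3 bands.
clean = [
    [1.0, 2.0, 0.5],
    [1.5, 1.0, 0.7],
    [0.2, 2.5, 1.1],
    [0.9, 1.8, 0.3],
]

# The same utterance heard through a hypothetical telephone channel.
telephone = apply_channel(clean, gains=[0.5, 2.0, 0.1])

norm_clean = subtract_mean(clean)
norm_tel = subtract_mean(telephone)

# Largest difference between the two normalized feature sets.
max_diff = max(
    abs(a - b)
    for fa, fb in zip(norm_clean, norm_tel)
    for a, b in zip(fa, fb)
)
print(max_diff < 1e-9)
```

After normalization the clean and channel-distorted features agree up to floating-point rounding. This sketch covers only a fixed channel response; the insensitivity to slowly varying background noise mentioned in the abstract would need a separate mechanism.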
