After years of productive research, speech synthesis is now profitably automating services by answering queries via constrained dialogs, directly accessing individual computer databases, and speaking text created from disparate sources of information. Directory-based services, such as Automated Customer Name and Address (ACNA), require synthesis with high intelligibility and name pronunciation accuracy. Current synthesis technology achieves those goals. However, even the best current speech technology is not good enough to be mindlessly “dropped” into complex services. Customized directory preprocessing is still necessary to transform listing data, which commonly contains unconventional abbreviations, unlabeled acronyms, and scrambled word ordering, into a sentence suitable for synthesis. This article describes state-of-the-art directory preprocessing programs that have led to successful implementations of synthesis services at two major U.S. telephone companies (Ameritech and Bell Atlantic). Of course, the basic capabilities of the synthesizer, such as pronunciation accuracy, speech quality, and naturalness, also play a large role. Efforts were made to ensure that locality terms were pronounced in accordance with local custom. Finally, for prompts and other fixed messages, this article describes experiments that determined whether the naturalness of recorded speech offsets the undesirable discontinuity between recorded and synthesized utterances.