Abstract

This paper reviews factors contributing to the quality of digitized and synthesized speech. Digital recording and playback of speech generally produce highly intelligible and natural-sounding speech, provided that the sampling rate and the number of quantization levels are adequate. Among the design factors important for the quality of synthesized speech are (a) the accuracy and sophistication of the text-to-phonetic-code conversion algorithms and (b) the type of digital data (LPC or formant data) and the attention paid to spectral transitions across phonemic boundaries, the latter being, in part, a function of the unit of speech employed for synthesis (phoneme or diphone). To maximize the intelligibility and comprehension of synthesized speech, (a) single-word responses should be avoided in favor of phrases and sentences, (b) discourse should be kept relatively simple, with fewer, clearly stated propositions, (c) the rate of presentation should be slower than normal, (d) listeners should be exposed to synthesized speech and trained to discriminate it, and (e) noise and other distractions should be kept to a minimum.
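
As a rough illustration of what "adequate" sampling rate and quantization levels can mean, the sketch below applies two standard textbook results: the Nyquist criterion (sample at no less than twice the highest frequency to be preserved) and the theoretical signal-to-quantization-noise ratio of an ideal uniform quantizer driven by a full-scale sinusoid (about 6.02 dB per bit plus 1.76 dB). The function names and the example bandwidth are assumptions chosen for illustration; they are not taken from the paper, and real speech coders differ from this idealized case.

```python
import math


def nyquist_rate_hz(max_frequency_hz: float) -> float:
    """Minimum sampling rate (Hz) needed to represent content up to max_frequency_hz."""
    return 2.0 * max_frequency_hz


def uniform_quantizer_sqnr_db(bits: int) -> float:
    """Theoretical SQNR (dB) of an ideal uniform quantizer for a full-scale sine wave.

    Equivalent to the familiar approximation 6.02 * bits + 1.76 dB.
    """
    return 20.0 * math.log10(2.0) * bits + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    # Illustrative assumption: speech band-limited to 4 kHz.
    print(f"Minimum sampling rate: {nyquist_rate_hz(4000.0):.0f} Hz")
    for bits in (8, 12, 16):
        levels = 2 ** bits  # number of quantization levels
        print(f"{bits:2d} bits ({levels} levels): {uniform_quantizer_sqnr_db(bits):.1f} dB")
```

Each additional bit doubles the number of quantization levels and adds roughly 6 dB of headroom over quantization noise, which is one way to reason about the trade-off between storage cost and perceived quality.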
