Abstract

To help establish naturalness criteria for speech synthesis, the acceptability of changes in segment duration has been investigated. Previous studies showed that acceptability evaluation depends on context, for example through an intraphrase positional effect in which listeners were more sensitive to the duration of phrase-initial segments than to that of phrase-final ones. Such contextual effects were independent of the original durations of the segments tested [Kato et al., J. Acoust. Soc. Am. 104, 540–549 (1998)]. However, past studies used only normal-speed speech, so temporal variation was limited. The current study therefore examined the contextual effect across a wide range of speaking rates. The materials were three-mora phrases with either rising or falling accent, spoken at three rates (fast, normal, and slow) with or without a carrier sentence. The duration of each vowel was lengthened or shortened by 10–50 ms, and listeners evaluated the acceptability of these changes. The results showed a clear speaking-rate effect in parallel with the intraphrase positional effect: acceptability declined more rapidly as the speaking rate became faster. These results, along with those of Kato et al., suggest that acceptability is evaluated relative to the speaking rate rather than to the original duration itself. [Work supported by TAO, Japan.]

a) Currently at GITI, Waseda University.

