Abstract

The on-off statistics of conversational speech have been investigated using a large database of 50 min of telephone speech. Based on the measured on-off patterns, the probability density functions of silence and talkspurt durations are each modeled approximately by a weighted sum of two geometric functions. Then, for any value of hangover or fill-in, speech parameters such as speech activity, average silence duration, and average talkspurt duration are calculated from the fitted probability density function of silence durations and compared with the measured values. The directly measured and calculated values of the speech parameters agree closely over a practical range of hangover and fill-in times. For a hangover time greater than 200 ms, silences and talkspurts can be fit by an exponential distribution and a constant-plus-exponential distribution, respectively. Conversely, for a fill-in time greater than 200 ms, silences and talkspurts can be modeled by a constant-plus-exponential distribution and an exponential distribution, respectively. With both large hangover and large fill-in values, the talkspurt model agrees closely with the measured data, whereas the silence model agrees less closely.
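
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes durations are counted in whole frames, that hangover extends each talkspurt by a fixed number of frames into the following silence, and that fill-in bridges only silences no longer than a threshold. The function names and the parameters `w`, `a`, `b` are hypothetical and are not the fitted values from the paper.

```python
def mixture_pmf(k, w, a, b):
    """P(duration == k frames), k >= 1, for a weighted sum of two geometric terms.
    w is the weight of the first term; a and b are the per-frame
    continuation probabilities (all illustrative, not fitted values)."""
    return w * (1 - a) * a ** (k - 1) + (1 - w) * (1 - b) * b ** (k - 1)


def mean_silence_after_hangover(h, w, a, b):
    """Average silence length once a hangover of h frames is applied.
    Assumption: a silence of k frames shrinks to k - h, and silences
    with k <= h vanish and are excluded from the average."""
    survive = w * a ** h + (1 - w) * b ** h                       # P(k > h)
    residual = w * a ** h / (1 - a) + (1 - w) * b ** h / (1 - b)  # E[(k - h); k > h]
    return residual / survive


def apply_hangover(frames, h):
    """Extend every talkspurt (1s) by up to h frames into the next silence (0s)."""
    out = list(frames)
    remaining = 0
    for i, active in enumerate(frames):
        if active:
            remaining = h
        elif remaining > 0:
            out[i] = 1
            remaining -= 1
    return out


def apply_fill_in(frames, f):
    """Fill interior silences of at most f frames; longer silences are untouched."""
    out = list(frames)
    n = len(frames)
    i = 0
    while i < n:
        if frames[i] == 0:
            j = i
            while j < n and frames[j] == 0:
                j += 1
            # Only bridge a gap that has a talkspurt on both sides.
            if i > 0 and j < n and (j - i) <= f:
                for t in range(i, j):
                    out[t] = 1
            i = j
        else:
            i += 1
    return out


pattern = [1, 0, 0, 1, 0, 0, 0, 0, 1]
hangover_pattern = apply_hangover(pattern, 2)
fill_in_pattern = apply_fill_in(pattern, 2)
```

In this sketch the two operations differ only in how they treat long silences: hangover shortens every silence by the same amount, while fill-in leaves silences above the threshold at full length. That difference is consistent with the abstract's observation that the two settings lead to different limiting distribution shapes.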
