Abstract

Spoken integers between 0 and 999 (e.g. “three hundred and seventy two”) were recorded by four speakers. For each utterance, a measure of the waveform amplitude (called the envelope) and pitch period estimates were recorded and plotted. It is shown that these two correlates can be used to predict word and syllable boundaries within each spoken integer. These junctural correlates also reflect the semantic structure of the integers, so that both the multiplication and the addition operations implied by the spoken number are represented. Each boundary is also assigned a likelihood factor. The junctural determinations are made without any use of segmental information (such as spectral estimates), and are directly related to the structure of the utterance.
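
The semantic structure mentioned above can be illustrated with a minimal sketch. It is not the paper's method and uses no acoustic data; it only shows how an integer in the stated range breaks down into the multiplication and addition operations implied by its spoken form. The function name `decompose` and the choice to treat tens words (such as "seventy") and numbers below twenty as single additive terms are assumptions made for illustration.

```python
def decompose(n: int) -> str:
    """Return the multiplicative/additive structure of an integer 0..999.

    Illustrative assumption: "hundred" is the only multiplier; tens words
    and numbers below twenty are treated as single additive terms.
    """
    if not 0 <= n <= 999:
        raise ValueError("expected an integer between 0 and 999")

    hundreds, rest = divmod(n, 100)
    terms = []

    if hundreds:
        # "three hundred" is read as a multiplication: 3 x 100
        terms.append(f"({hundreds} x 100)")

    if rest >= 20:
        # "seventy two" is read as an addition: 70 + 2
        tens, units = divmod(rest, 10)
        terms.append(str(tens * 10))
        if units:
            terms.append(str(units))
    elif rest or not terms:
        # Numbers below twenty (and a bare zero) are single terms
        terms.append(str(rest))

    return " + ".join(terms)


print(decompose(372))
```

Under these assumptions, "three hundred and seventy two" contains one multiplicative juncture (between "three" and "hundred") and two additive junctures, which is the kind of structure the junctural correlates are reported to reflect.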
