Abstract
The use of demisyllables plus voice‐assimilated phonetic affixes (/-s, -z, -t/, etc.), rather than segments or syllables, as the basic units in speech synthesis would have important advantages [O. Fujimura, J. Acoust. Soc. Am. 59, S55(A) (1976)]. We have tested the practicality of such a model for monosyllabic words of English, using an inventory of demisyllables with a variety of nonaffixed clusters and syllable nuclei. Each initial demisyllable is prepared so as to include a constant length of the transitional portion, and nucleus length variations dependent on the final consonant(s) are automatically included in the “vowel” portion of the final demisyllable. Syllable nucleus “coloration” by postvocalic nasals, /r/, or /l/ is also found to be automatically accounted for in this scheme. For example, the initial demisyllable excised from CVC may be joined with -VN to make a natural‐sounding CVN syllable. These experiments used both a parallel‐type formant analyzer-synthesizer [J.P. Olive and M.J. Macchi, J. Acoust. Soc. Am. 58, S23(A) (1975)] and an LPC processor, with some retouching, and with minimal smoothing across the concatenative boundary of the parameters.
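The concatenation step described above can be illustrated with a minimal sketch. The abstract does not specify how the parameters are smoothed across the boundary, so the linear interpolation used here is an assumption, as are the function name `join_demisyllables`, the `smooth` parameter, and the representation of each demisyllable as a list of per-frame parameter values (for example, formant frequencies in Hz).

```python
from typing import List

# One analysis frame: a list of synthesis parameters (assumed here to be
# formant frequencies in Hz; the abstract does not fix the representation).
Frame = List[float]


def join_demisyllables(initial: List[Frame], final: List[Frame],
                       smooth: int = 1) -> List[Frame]:
    """Concatenate two parameter tracks and smooth lightly across the joint.

    Hypothetical helper: `smooth` frames on each side of the boundary are
    replaced by a linear interpolation between the nearest untouched frame
    on the left and the nearest untouched frame on the right.
    """
    track = [list(f) for f in initial] + [list(f) for f in final]
    boundary = len(initial)

    lo = max(boundary - smooth - 1, 0)           # left anchor, kept as-is
    hi = min(boundary + smooth, len(track) - 1)  # right anchor, kept as-is
    span = hi - lo
    if smooth <= 0 or span < 2:
        return track  # nothing to smooth: plain concatenation

    left, right = track[lo], track[hi]
    for i in range(lo + 1, hi):
        w = (i - lo) / span  # interpolation weight, 0 at left, 1 at right
        track[i] = [(1 - w) * a + w * b for a, b in zip(left, right)]
    return track


# Example: an initial demisyllable ending near 500 Hz joined to a final
# demisyllable starting near 700 Hz (made-up values for illustration).
initial_cv = [[500.0], [500.0], [500.0]]
final_vn = [[700.0], [700.0], [700.0]]
joined = join_demisyllables(initial_cv, final_vn, smooth=1)
```

Keeping `smooth` small mirrors the abstract's "minimal smoothing": only the frames adjacent to the joint are altered, and the rest of each demisyllable is passed through unchanged.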