Abstract

The ability to speak is probably our most complex cognitive-motor skill. It is, moreover, a uniquely human and a universal skill. In speaking, myriad processes involving a wide range of cerebral structures cooperate in the generation of a temporally organized structure, an articulatory pattern that has overt speech as its physical-acoustic effect. The temporal organization of speech is multileveled. There are, on the one hand, the relatively slow strategic processes involved in planning the speech act. When we speak, our attention is almost fully dedicated to what we say; how we say it largely takes care of itself. There are, on the other hand, the fast automatic processes of formulating speech. Words, for instance, are produced at a rate of about 2 per second, but so-called anacruses of up to 7 words per second are possible. At this rate we retrieve lexical items from a mental lexicon that contains thousands, probably tens of thousands, of items. In fluent speech our average syllabic rate is about 3 per second, whereas individual speech sounds come as fast as 10 to 15 phonemes per second. Normally, all of this happens without any attentional control. These high-speed automatic processes are, moreover, surprisingly error-proof in normal speakers. Estimates of the rate of lexical selection errors range around one per thousand, and phonemic errors are rarer still. What are the mechanisms that subserve this precise, multilevel timing in speech production? In the following I discuss recent research in our laboratory concerned with the time course of spoken word production at three levels of processing, as depicted in FIGURE 1. The first concerns lexical selection, the second phonological encoding and syllabification, and the third phonetic encoding, in particular the retrieval of syllabic gestural scores.
