Listeners parse the speech signal effortlessly into words and phrases, but many questions remain about how. One classic idea is that rhythm-related auditory principles play a role, in particular, that a psycho-acoustic "iambic-trochaic law" (ITL) ensures that alternating sounds varying in intensity are perceived as recurrent binary groups with initial prominence (trochees), while alternating sounds varying in duration are perceived as binary groups with final prominence (iambs). We test the hypothesis that the ITL is in fact an indirect consequence of the parsing of speech along two in-principle orthogonal dimensions: prominence and grouping. Results from several perception experiments show that the two dimensions, prominence and grouping, are each reliably cued by both intensity and duration, while foot type is not associated with consistent cues. The ITL emerges only when one manipulates either intensity or duration in an extreme way. Overall, the results suggest that foot perception is derivative of the cognitively more basic decisions of grouping and prominence, and the notions of trochee and iamb may not play any direct role in speech parsing. A task manipulation furthermore gives new insight into how these decisions mutually inform each other.
Read full abstract