Abstract
This study explored the role of formant transitions and F0-contour continuity in binding together speech sounds into a coherent stream. Listening to a repeating recorded word produces verbal transformations to different forms; stream segregation contributes to this effect and so it can be used to measure changes in perceptual coherence. In experiment 1, monosyllables with strong formant transitions between the initial consonant and following vowel were monotonized; each monosyllable was paired with a weak-transitions counterpart. Further stimuli were derived by replacing the consonant-vowel transitions with samples from adjacent steady portions. Each stimulus was concatenated into a 3-min-long sequence. Listeners reported more forms in the transitions-removed condition only for strong-transitions words, for which formant-frequency discontinuities were substantial. In experiment 2, the F0 contour of all-voiced monosyllables was shaped to follow a rising or falling pattern, spanning one octave. Consecutive tokens either had the same contour, giving an abrupt F0 change between each token, or alternated, giving a continuous contour. Discontinuous sequences caused more transformations and forms, and shorter times to the first transformation. Overall, these findings support the notion that continuity cues provided by formant transitions and the F0 contour play an important role in maintaining the perceptual coherence of speech.
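The difference between the two sequence types in experiment 2 can be illustrated with a minimal sketch. It assumes a log-linear glide, a 100 Hz lower bound, and 50 samples per token; none of these values come from the study, which specifies only a rising or falling pattern spanning one octave. The function names are hypothetical.

```python
import math

def token_contour(direction, n_points=50, f_low=100.0):
    """F0 glide spanning one octave (f_low to 2 * f_low).

    Assumption: the glide is linear in log-frequency; f_low and
    n_points are illustrative values, not taken from the study.
    """
    rising = [f_low * 2.0 ** (i / (n_points - 1)) for i in range(n_points)]
    return rising if direction == "rise" else rising[::-1]

def sequence_contour(n_tokens, alternate, n_points=50, f_low=100.0):
    """Concatenate token contours into one sequence.

    alternate=False repeats the same (rising) contour for every token;
    alternate=True flips between rising and falling on each token.
    """
    contour = []
    for k in range(n_tokens):
        direction = "rise" if (not alternate or k % 2 == 0) else "fall"
        contour.extend(token_contour(direction, n_points, f_low))
    return contour

def boundary_jumps_octaves(contour, n_points=50):
    """Absolute F0 step, in octaves, at each boundary between tokens."""
    return [abs(math.log2(contour[i] / contour[i - 1]))
            for i in range(n_points, len(contour), n_points)]

same = sequence_contour(4, alternate=False)
alternating = sequence_contour(4, alternate=True)
print(boundary_jumps_octaves(same))         # one-octave jump at every boundary
print(boundary_jumps_octaves(alternating))  # no jump at any boundary
```

Repeating the same contour leaves a one-octave step at every token boundary, whereas alternating contours join end to end, which is the continuity manipulation the abstract describes.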
Highlights
Perception of the temporal order of speech sounds is a non-trivial problem, even when speech is heard in quiet. This is because the speech signal consists of rapidly changing and diverse acoustic elements, a signal that has been described as a …
Abbreviations: CVC, consonant-vowel-consonant; F0, fundamental frequency; PSOLA, Pitch Synchronous Overlap and Add method; VTE, verbal transformation effect
A two-way analysis of variance (ANOVA) was performed on each measure. The statistical outcomes are presented in Table 1, and significant effects are indicated in Fig. 2.
According to the experimental hypothesis, the critical outcome is the interaction term. This is because removing formant transitions should cause a greater reduction in perceptual coherence for strong- than weak-transitions words, which should manifest as a greater rise in the number of verbal transformations (VTs) and forms and a greater reduction in the time to the first VT.
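The logic of the interaction term can be sketched as a difference of differences across the four cells of the design. This is only an illustration of the contrast, not the ANOVA reported in Table 1; the function name, the factor labels, and the cell values are hypothetical placeholders and are not data from the study.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 design.

    cell_means maps (word_type, condition) to a mean score, with
    word_type in {"strong", "weak"} and condition in {"intact", "removed"}.
    These labels are assumptions made for this sketch.
    """
    strong_effect = cell_means[("strong", "removed")] - cell_means[("strong", "intact")]
    weak_effect = cell_means[("weak", "removed")] - cell_means[("weak", "intact")]
    return strong_effect - weak_effect

# Made-up numbers of reported forms, for illustration only.
example = {
    ("strong", "intact"): 4,
    ("strong", "removed"): 7,
    ("weak", "intact"): 4,
    ("weak", "removed"): 5,
}
print(interaction_contrast(example))  # (7 - 4) - (5 - 4) = 2
```

A positive contrast for the number of VTs or forms, or a negative one for the time to the first VT, is the pattern the hypothesis predicts; whether it is reliable is what the interaction term of the ANOVA tests.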
Summary
The intelligibility of speech depends on the ability of the listener to identify the constituent phonetic segments, and to perceive them in the correct order. Perception of the temporal order of speech sounds is a non-trivial problem, even when speech is heard in quiet. This is because the speech signal consists of rapidly changing and diverse acoustic elements, a signal that has been described as a … It is much harder to judge the relative timing of sounds that form part of separate streams rather than the same stream (Bregman and Campbell, 1971; Roberts et al., 2002). This makes sense from the perspective of auditory scene analysis (Bregman, 1990), because sounds falling in different streams are interpreted as arising from independent sources and so their relative timing is seen as accidental rather than a meaningful property of a stimulus. Despite the greater similarity between sounds, the accuracy of order judgments fell to near chance for items 100 ms long (i.e., 10 phonemes/s).