Abstract
Speech is intelligible even when the temporal envelope of speech is distorted. The current study investigates how native and non-native speakers perceptually restore temporally distorted speech. Participants were native English speakers (NS) and native Japanese speakers who spoke English as a second language (NNS). In Experiment 1, participants listened to “locally time-reversed speech”, in which every x ms of the speech signal was reversed on the temporal axis. Here, the local time reversal shifted the constituents of the speech signal forward or backward from their original positions, and the amplitude envelope of speech was altered as a function of reversed segment length. In Experiment 2, participants listened to “modulation-filtered speech”, in which the modulation frequency components of speech were low-pass filtered at a particular cut-off frequency. Here, the temporal envelope of speech was altered as a function of cut-off frequency. The results suggest that speech becomes gradually unintelligible as the length of reversed segments increases (Experiment 1), and as a lower cut-off frequency is imposed (Experiment 2). The two experiments show equivalent levels of speech intelligibility across six levels of degradation, for native and non-native speakers respectively, which raises the question of whether the regular occurrence of local time reversal can be discussed in the modulation frequency domain by simply converting the length of reversed segments (ms) into frequency (Hz).
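The local time reversal used in Experiment 1 can be illustrated with a minimal sketch. It is not the authors' stimulus-generation code: the function name `locally_time_reverse`, its parameters, and the choice to reverse a shorter final segment as well are assumptions made for illustration. Real stimuli may also involve details (such as amplitude ramps at segment boundaries) that are not described here.

```python
import numpy as np

def locally_time_reverse(signal, sample_rate, segment_ms):
    """Reverse each consecutive block of `segment_ms` milliseconds on the time axis.

    Hypothetical helper for illustration only. A final block shorter than
    `segment_ms` is also reversed (an assumption, not stated in the abstract).
    """
    signal = np.asarray(signal)
    # Number of samples in one reversed segment.
    seg_len = int(round(sample_rate * segment_ms / 1000.0))
    if seg_len < 1:
        raise ValueError("segment_ms is too short for this sample rate")
    out = np.empty_like(signal)
    for start in range(0, len(signal), seg_len):
        stop = min(start + seg_len, len(signal))
        # Samples stay inside their block but are flipped in time.
        out[start:stop] = signal[start:stop][::-1]
    return out

# Toy example: 10 samples at 1000 Hz, reversed in 4-ms (4-sample) segments.
toy = np.arange(10)
print(locally_time_reverse(toy, sample_rate=1000, segment_ms=4))
# → [3 2 1 0 7 6 5 4 9 8]
```

Each sample moves forward or backward only within its own segment, so longer segments displace samples further from their original positions, which is consistent with intelligibility dropping as the reversed segment length increases.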
Highlights
People are capable of perceptually restoring temporally distorted speech
The modulation frequency that is involved in speech intelligibility cannot be computed by just looking at the critical reversed segment length of locally time-reversed speech, and converting the duration into frequency (Hz)
The results suggest that speech becomes gradually unintelligible as longer segments of speech are reversed in time, and as a lower cut-off frequency is imposed
Summary
People are capable of perceptually restoring temporally distorted speech. Earlier studies suggest that people can perceptually restore a part of speech that is physically missing from the speech signal (Cherry, 1953; Broadbent, 1954; Cherry and Wiley, 1967; Warren, 1970; Warren and Warren, 1970; Warren and Obusek, 1971). As for the acoustic cues, phonemic restoration takes place under certain conditions: when the replacing sound is louder than the original sound (Warren and Warren, 1970; Warren et al., 1972), when the center frequencies of the replacing and replaced sounds are matched (Warren and Warren, 1970; Warren et al., 1972), and when the replacing sound has temporal, spectral, and spatial characteristics similar to those of the original sound (Samuel, 1981a,b, 1996; Kashino, 2006). Both native and non-native speakers perceptually restore the missing phoneme when these acoustic conditions are met. Previous studies suggest that acoustic characteristics, lexical context, and linguistic coherence are all taken into account, consciously or unconsciously, when perceptually restoring the deleted part of speech.