Abstract

Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally degraded (vocoded) speech in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials in total; frequency range: 4 Hz, 8 Hz, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides a significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.

Highlights

  • Temporal-envelope information is crucial for speech perception [1,2,3,4,5,6,7,8,9,10]

  • Temporal-envelope (TE) information but not temporal fine structure (TFS) is preserved when hearing is restored through cochlear implants, and listeners can learn to use the impoverished

  • Erb et al. [17, 18] found that listeners who were better at discriminating amplitude-modulation (AM) rates were quicker to improve in their perception of vocoded speech, which is often used to simulate the signal as heard through a cochlear implant

Summary

Introduction

Temporal-envelope information is crucial for speech perception [1,2,3,4,5,6,7,8,9,10]. Listeners’ performance on a noise-vocoded consonant recognition task (consonants embedded in vowel-consonant-vowel contexts) was the best tested predictor of performance in other vocoded linguistic tasks (e.g., word and sentence recognition). These findings are in line with the results obtained using tone- and noise-vocoded speech stimuli by Loebach et al. [26,27,28], who postulated that training in low-level acoustic-phonetic listening offers the most promising route for adaptation to distorted speech. We trained listeners on either detection of TE modulation (sinusoidal AM) or discrimination of AM rates, and tested for improvement in perception of vocoded vowel-consonant-vowel (VCV) ‘nonsense’ syllables. These sub-lexical stimuli were chosen to minimize the influence of higher-level linguistic structure (e.g., semantic or syntactic) on learning effects. We predicted that (1) listeners’ performance on AM tasks prior to training would be correlated with their ability to quickly adapt to vocoded VCV segments on initial exposure (pre-training), and (2) training on TE cues without speech content would improve vocoded consonant identification when compared to untrained, control listeners.
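To make the notion of a sinusoidal AM stimulus concrete, the following is a minimal illustrative sketch, not the stimulus-generation procedure used in the study. It multiplies a carrier by the envelope 1 + m·sin(2π·f_m·t), where f_m is the modulation rate and m the modulation depth. Only the 4, 8, and 16 Hz rates come from the text above; the Gaussian-noise carrier, the depth, the duration, the sampling rate, and the function names `am_envelope` and `am_noise` are assumptions chosen for illustration.

```python
import math
import random


def am_envelope(rate_hz, depth, n_samples, sample_rate):
    """Sinusoidal envelope 1 + depth * sin(2*pi*rate*t), one value per sample."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must lie in [0, 1]")
    return [
        1.0 + depth * math.sin(2.0 * math.pi * rate_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def am_noise(rate_hz, depth=1.0, duration_s=1.0, sample_rate=16000, seed=0):
    """Gaussian-noise carrier (an assumption) multiplied by the AM envelope."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_samples = int(round(duration_s * sample_rate))
    envelope = am_envelope(rate_hz, depth, n_samples, sample_rate)
    return [e * rng.gauss(0.0, 1.0) for e in envelope]


# One stimulus per modulation rate named in the abstract.
stimuli = {rate: am_noise(rate) for rate in (4, 8, 16)}
```

In a sketch like this, a detection trial would contrast a modulated stimulus (depth above 0) with an unmodulated one (depth of 0), whereas a rate-discrimination trial would contrast two stimuli that differ only in `rate_hz`.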
