Abstract

Spatial release from masking refers to a benefit for speech understanding. It occurs when a target talker and a masker talker are spatially separated. In those cases, speech intelligibility for target speech is typically higher than when both talkers are at the same location. In cochlear implant listeners, spatial release from masking is much reduced or absent compared with normal hearing listeners. Perhaps this reduced spatial release occurs because cochlear implant listeners cannot effectively attend to spatial cues. Three experiments examined factors that may interfere with deploying spatial attention to a target talker masked by another talker. To simulate cochlear implant listening, stimuli were vocoded with two unique features. First, we used 50-Hz low-pass filtered speech envelopes and noise carriers, strongly reducing the possibility of temporal pitch cues; second, co-modulation was imposed on target and masker utterances to enhance perceptual fusion between the two sources. Stimuli were presented over headphones. Experiments 1 and 2 presented high-fidelity spatial cues with unprocessed and vocoded speech. Experiment 3 maintained faithful long-term average interaural level differences but presented scrambled interaural time differences with vocoded speech. Results show a robust spatial release from masking in Experiments 1 and 2, and a greatly reduced spatial release in Experiment 3. Faithful long-term average interaural level differences were insufficient for producing spatial release from masking. This suggests that appropriate interaural time differences are necessary for restoring spatial release from masking, at least for a situation where there are few viable alternative segregation cues.
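The envelope processing described above can be illustrated with a minimal single-band sketch. It is not the authors' implementation: the one-pole filter, the uniform noise carrier, and the names `envelope_lowpass` and `noise_vocode_band` are assumptions made for illustration. The band-pass analysis filters of a full vocoder and the co-modulation step are omitted.

```python
import math
import random

def envelope_lowpass(signal, fs, cutoff_hz=50.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter.

    The one-pole filter is an assumption; the source only states that
    envelopes were low-pass filtered at 50 Hz.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, y = [], 0.0
    for x in signal:
        y += alpha * (abs(x) - y)  # move toward the rectified sample
        env.append(y)
    return env

def noise_vocode_band(signal, fs, cutoff_hz=50.0, seed=0):
    """Replace the fine structure of one band with a noise carrier."""
    rng = random.Random(seed)
    env = envelope_lowpass(signal, fs, cutoff_hz)
    # Hypothetical carrier: uniform white noise in [-1, 1].
    return [e * rng.uniform(-1.0, 1.0) for e in env]

# Toy input: a 500-Hz tone with a slow 4-Hz amplitude modulation.
fs = 16000
speech_like = [
    (0.5 + 0.5 * math.sin(2 * math.pi * 4 * n / fs))
    * math.sin(2 * math.pi * 500 * n / fs)
    for n in range(fs)
]
envelope = envelope_lowpass(speech_like, fs)
vocoded = noise_vocode_band(speech_like, fs)
```

Because the carrier is noise and the envelope is limited to 50 Hz, the output keeps the slow amplitude fluctuations of the input but carries little usable periodicity information, which is the property the abstract relies on.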

Highlights

  • In natural acoustic settings, source segregation can be difficult, resulting in deteriorated speech intelligibility even for normal hearing listeners

  • Spatial release from masking (SRM) in cochlear implant listeners was smaller for speech in speech than for speech in noise. This was observed despite preserving both interaural level differences (ILDs) and across-electrode onset interaural time differences (ITDs). These findings indicate that high-fidelity, or "clean," long-term ILDs alone do not suffice for directing spatial attention when short-term ITDs and ILDs fluctuate dynamically, and that listeners need access to faithfully encoded ITD cues ([22],[24])

  • Acoustic spatial cues differed distinctly between the two locations: Across the entire sound spectrum, ITDs and ILDs in the front condition were close to 0, whereas in the side condition, ITDs equaled around 800 µs and ILDs increased with increasing frequency, covering a range between 5 and 25 dB


Summary

Introduction

Source segregation can be difficult, resulting in deteriorated speech intelligibility even for normal hearing listeners. When the target is to the left and the masker is in front, the target-to-masker ratio (TMR) in the left ear is higher than the TMR in the right ear. In addition to this monaural benefit, binaural decorrelation processing can combat energetic masking by allowing a listener to detect a target when the interaural time differences (ITDs) between target and masker differ (e.g., [3]). For attention-driven SRM to occur, at least for normal hearing listeners, the spatial separation between competing sources and the resulting binaural differences do not need to be large, provided that listeners can perceive the sounds at spatially distinct locations and attend to the direction of the target source ([7], [8], [9], [10]).
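The monaural better-ear effect described above can be made concrete with a short sketch that computes the target-to-masker ratio at each ear. The signals, the 10-dB ILD, and the helper names `mean_power`, `tmr_db`, and `apply_gain_db` are hypothetical values and names chosen for illustration; they are not taken from the study.

```python
import math

def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)

def tmr_db(target, masker):
    """Target-to-masker ratio in dB at one ear."""
    return 10.0 * math.log10(mean_power(target) / mean_power(masker))

def apply_gain_db(samples, gain_db):
    """Scale a signal by a gain given in dB."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]

# Toy signals: two tones of equal power standing in for the talkers.
fs = 16000
target = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
masker = [math.sin(2 * math.pi * 550 * n / fs) for n in range(fs)]

# Assumed head shadow: a target on the left is boosted in the left ear
# and attenuated in the right ear. A frontal masker reaches both ears
# at the same level.
ild_db = 10.0  # hypothetical broadband ILD
left_tmr = tmr_db(apply_gain_db(target, +ild_db / 2), masker)
right_tmr = tmr_db(apply_gain_db(target, -ild_db / 2), masker)
better_ear_advantage = left_tmr - right_tmr
```

In this simplified setting the difference in TMR between the ears equals the target ILD, because the frontal masker contributes equally to both ears.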
