Abstract

In multi-talker listening environments, the combination of different voice streams may distort each source’s individual message, causing deficits in comprehension. Voice characteristics such as pitch and timbre are major dimensions of auditory perception and play a vital role in grouping and segregating incoming sounds based on their acoustic properties. The current study investigated how pitch and timbre cues (determined by fundamental frequency, denoted F0, and spectral slope, respectively) affect perceptual integration and segregation of complex-tone sequences within an auditory streaming paradigm. Twenty normal-hearing listeners participated in a traditional auditory streaming experiment using two alternating sequences of harmonic tone complexes, A and B, with manipulated F0 and spectral slope. Grouping ranges, the F0/spectral slope ranges over which auditory grouping occurs, were measured at various F0/spectral slope differences between tones A and B. Results demonstrated that the grouping ranges were largest when there was no F0/spectral slope difference between tones A and B, and decreased by a factor of two as the differences increased to ±1 semitone in F0 and ±1 dB/octave in spectral slope. In other words, increased differences in either F0 or spectral slope allowed listeners to distinguish between the harmonic stimuli more easily, and thus to group them together less. These findings suggest that pitch/timbre difference cues play an important role in how we perceive harmonic sounds in an auditory stream, reflecting our ability to group or segregate human voices in a multi-talker listening environment.
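To make the two manipulated parameters concrete, the following is a minimal sketch of how a harmonic tone complex with a given F0 and spectral slope might be synthesized. It is an illustration only, not the authors' stimulus-generation procedure: the function names, the 200 Hz reference F0, the number of harmonics, the duration, and the sampling rate are all assumptions chosen for the example.

```python
import math


def shift_semitones(f0_hz, semitones):
    """Return f0_hz shifted by a number of equal-tempered semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def harmonic_amplitudes(n_harmonics, slope_db_per_octave):
    """Linear amplitude of each harmonic relative to the fundamental.

    Harmonic k lies log2(k) octaves above the fundamental, so its level
    is slope_db_per_octave * log2(k) dB relative to harmonic 1.
    """
    return [
        10.0 ** (slope_db_per_octave * math.log2(k) / 20.0)
        for k in range(1, n_harmonics + 1)
    ]


def harmonic_complex(f0_hz, slope_db_per_octave, duration_s=0.1,
                     sample_rate=16000, n_harmonics=10):
    """Synthesize a peak-normalized harmonic tone complex (assumed parameters)."""
    amps = harmonic_amplitudes(n_harmonics, slope_db_per_octave)
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # Sum sine-phase harmonics at integer multiples of F0.
        samples.append(sum(
            a * math.sin(2.0 * math.pi * k * f0_hz * t)
            for k, a in enumerate(amps, start=1)
        ))
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]


# Hypothetical A/B pair: B is one semitone above A and 1 dB/octave steeper.
tone_a = harmonic_complex(200.0, -1.0)
tone_b = harmonic_complex(shift_semitones(200.0, 1.0), -2.0)
```

Under these assumptions, tones A and B differ by exactly the ±1-semitone and ±1-dB/octave steps mentioned above; alternating them in time would give an A-B sequence of the kind used in streaming paradigms.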

Highlights

  • Multiple sound sources are often simultaneously active in everyday listening environments

  • We explore the perceptual cues of pitch and timbre by varying the physical parameters F0 and spectral slope, respectively, of harmonic complex stimuli, and observe their effects on normal-hearing (NH) listeners’ ability to group and segregate sound sources in an auditory streaming paradigm

  • The results showed that the average F0 grouping range was 54 ± 18 Hz in the absence of spectral slope differences between A and B (i.e., −1 dB/octave spectral slope of B)


Summary

Introduction

Multiple sound sources are often simultaneously active in everyday listening environments. In a multi-talker listening environment, the auditory system must distinguish the target voice from all the others and isolate it, so that the listener can understand, process, and properly respond to it. This ability is important but not always easy to exercise, because sounds coming from multiple sources often blend together. The ability to focus one’s auditory attention on a single stimulus amidst several competing stimuli is commonly known as the “cocktail party effect” (Cherry, 1953).
