Abstract

Accurate pitch perception of harmonic complex tones is widely believed to rely on temporal fine structure information conveyed by the precise phase-locked responses of auditory-nerve fibers. However, accurate pitch perception remains possible even when spectrally resolved harmonics are presented at frequencies beyond the putative limits of neural phase locking, and it is unclear whether residual temporal information, or a coarser rate-place code, underlies this ability. We addressed this question by measuring human pitch discrimination at low and high frequencies for harmonic complex tones, presented either in isolation or in the presence of concurrent complex-tone maskers. We found that concurrent complex-tone maskers impaired performance at both low and high frequencies, although the additional impairment produced by the maskers at high frequencies, relative to low frequencies, differed between the tested masker types. We then combined simulated auditory-nerve responses to our stimuli with ideal-observer analysis to quantify the extent to which performance was limited by peripheral factors. We found that the worsening of both frequency discrimination and F0 discrimination at high frequencies could be well accounted for (in relative terms) by optimal decoding of all available information at the level of the auditory nerve. A Python package is provided to reproduce these results and to simulate responses to acoustic stimuli from the three previously published models of the human auditory nerve used in our analyses.
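
As a rough sketch of how an ideal-observer analysis of this kind can work, the example below computes the Cramér-Rao bound on F0 discrimination from a toy rate-place representation: independent Poisson-spiking fibers with Gaussian tuning on a log-frequency axis. This is a minimal illustration, not the published Python package or the auditory-nerve models used in the study; all function names, parameter values, and the two example F0s are assumptions made for the sketch.

```python
import numpy as np

def mean_rates(f0, cfs, n_harmonics=10, bw_oct=0.25, spont=10.0, gain=80.0):
    """Toy rate-place model (NOT an auditory-nerve model): mean firing rate
    (spikes/s) of fibers with characteristic frequencies `cfs` responding to a
    harmonic complex tone with fundamental `f0`, using Gaussian tuning curves
    on a log-frequency axis."""
    harmonics = f0 * np.arange(1, n_harmonics + 1)
    d_oct = np.log2(cfs[:, None] / harmonics[None, :])   # CF-to-harmonic distance (octaves)
    drive = np.exp(-0.5 * (d_oct / bw_oct) ** 2).sum(axis=1)
    return spont + gain * drive

def f0_threshold(f0, cfs, dur=0.2, delta=1e-3):
    """Cramér-Rao bound on F0 discrimination for an ideal observer reading
    independent Poisson spike counts from each fiber over duration `dur`.
    Fisher information for Poisson counts: FI = sum_i dur * (dr_i/dF0)**2 / r_i."""
    r = mean_rates(f0, cfs)
    dr = (mean_rates(f0 * (1 + delta), cfs) - r) / (f0 * delta)  # numerical derivative
    fisher = np.sum(dur * dr ** 2 / r)
    return 1.0 / np.sqrt(fisher)   # smallest detectable F0 change (Hz) at d' = 1

cfs = np.geomspace(125.0, 16000.0, 1000)   # log-spaced characteristic frequencies
for f0 in (280.0, 1400.0):                 # illustrative low and high F0s
    thr = f0_threshold(f0, cfs)
    print(f"F0 = {f0:6.0f} Hz: rate-place ideal-observer threshold ~ {100 * thr / f0:.3f}% of F0")
```

A rate-only bound like this reflects only the place code; the analysis described in the abstract decodes all available information at the level of the auditory nerve.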

Highlights

  • Prevailing theories posit that the auditory system relies on the stimulus-driven timing of spikes in the auditory nerve, termed phase locking, to estimate pitch

  • “Temporal” theories suggest instead that pitch is derived from temporal information, including temporal fine structure (TFS) information encoded in inter-spike intervals by the phase-locking properties of auditory-nerve fibers and other temporal information, such as envelope modulation [13,15,16]

  • Many studies have recorded auditory-nerve responses to the types of harmonic complex tone (HCT) stimuli used in pitch experiments [13,20,48,49], but these have not included HCTs at the very high frequencies used in recent human psychophysical work [36–38]

Introduction

Pitch is a primary perceptual dimension of sound. It plays a key role in the perception of music, where it constitutes the basis of melody and harmony [1], as well as in the perception of speech, where it has important suprasegmental functions and conveys information about talker identity [2–4].

Several competing theories have been proposed to explain how pitch is extracted from the auditory-nerve representation of sound. “Place” or “rate-place” theories contend that pitch is derived by analysis of the spatial pattern of average firing rates of auditory-nerve fibers, in which information about the frequency content of a stimulus is encoded via the basilar membrane’s frequency-to-place (or tonotopic) mapping [13,14]. “Temporal” theories suggest instead that pitch is derived from temporal information, including temporal fine structure (TFS) information encoded in inter-spike intervals by the phase-locking properties of auditory-nerve fibers and other temporal information, such as envelope modulation [13,15,16]. “Spatiotemporal” or “spectrotemporal” theories, motivated by the fact that neither place nor temporal theories account well for all pitch phenomena, propose that both the frequency-to-place mapping and TFS information play crucial roles in pitch perception [13,17–20].
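
To make the contrast between these families of theories concrete, the short sketch below estimates the F0 of a synthetic harmonic complex tone in two ways: from the spacing of resolved spectral peaks (a place-like cue) and from the first major peak of the waveform's autocorrelation (a temporal, interval-based cue). This is a simplified, acoustic-domain illustration rather than a model of auditory-nerve coding, and the signal parameters are arbitrary choices for the example.

```python
import numpy as np

fs = 48_000                        # sample rate (Hz)
f0 = 220.0                         # true fundamental of the synthetic tone
t = np.arange(0, 0.2, 1 / fs)
# Harmonic complex tone: equal-amplitude harmonics 1-10 in sine phase
x = sum(np.sin(2 * np.pi * f0 * h * t) for h in range(1, 11))

# "Place"-like estimate: spacing between resolved spectral peaks
spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
freqs = np.fft.rfftfreq(len(x), 1 / fs)
is_peak = np.r_[False, (spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:]), False]
peaks = freqs[is_peak & (spec > 0.1 * spec.max())]
f0_place = np.median(np.diff(peaks))            # harmonic spacing ~ F0

# "Temporal" estimate: first major peak of the waveform autocorrelation
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
lag_min = int(fs / 1000)                        # ignore lags shorter than 1 ms
lag = lag_min + np.argmax(ac[lag_min:int(fs / 50)])
f0_temporal = fs / lag

print(f"true F0 = {f0:.1f} Hz, place estimate = {f0_place:.1f} Hz, "
      f"temporal estimate = {f0_temporal:.1f} Hz")
```

Both estimates recover the 220 Hz fundamental here; the theories differ in which of these cues the auditory system is assumed to extract from neural responses.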
