Abstract

Current models for neural coding of vowels are typically based on linear descriptions of the auditory periphery, and fail at high sound levels and in background noise. These models rely on either auditory nerve discharge rates or phase locking to temporal fine structure. However, both discharge rates and phase locking saturate at moderate to high sound levels, and phase locking is degraded in the CNS at middle to high frequencies. The fact that speech intelligibility is robust over a wide range of sound levels is problematic for codes that deteriorate as the sound level increases. Additionally, a successful neural code must function for speech in background noise at levels that are tolerated by listeners. The model presented here resolves these problems, and incorporates several key response properties of the nonlinear auditory periphery, including saturation, synchrony capture, and phase locking to both fine structure and envelope temporal features. The model also includes the properties of the auditory midbrain, where discharge rates are tuned to amplitude fluctuation rates. The nonlinear peripheral response features create contrasts in the amplitudes of low-frequency neural rate fluctuations across the population. These patterns of fluctuations result in a response profile in the midbrain that encodes vowel formants over a wide range of levels and in background noise. The hypothesized code is supported by electrophysiological recordings from the inferior colliculus of awake rabbits. This model provides information for understanding the structure of cross-linguistic vowel spaces, and suggests strategies for automatic formant detection and speech enhancement for listeners with hearing loss.

Highlights

  • Vowels carry a heavy functional load in all languages, especially in running speech and discourse

  • This response shows that reductions in the discharge rate of BP responses (Fig. 4D, blue) are ambiguous, as they may be due either to reduced fluctuations of auditory nerve (AN) responses tuned near formants (Fig. 1B) or to reduced spectral energy (Fig. 4D, arrow, 1500 Hz)

  • This ambiguity is resolved by the LPBR model (Fig. 4D, red), which only responds when both sufficient energy and reduced fluctuations are present on the inputs to the model midbrain cell


Introduction

Vowels carry a heavy functional load in all languages, especially in running speech and discourse. Studies of auditory nerve (AN) speech coding typically focus on response rates or temporal synchrony at frequencies to which a fiber is most sensitive (Sachs and Young, 1979; Young and Sachs, 1979; Delgutte and Kiang, 1984; Schilling et al., 1998). These codes are adequate for low-level speech sounds in quiet, but they fail for moderate-to-high sound levels and in background noise. The vowel-coding hypothesis tested here focuses on F0-related neural fluctuations and on contrasts in their amplitudes across neurons tuned to different frequencies.
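The fluctuation-contrast idea can be sketched numerically: in a harmonic complex with formant-shaped spectral peaks, a frequency channel centered on a formant is dominated by a single strong harmonic and thus shows weak envelope fluctuation at F0, whereas a channel between formants carries several comparable harmonics that beat strongly at F0. The minimal NumPy sketch below illustrates this with Gaussian-weighted spectral "channels" standing in for cochlear filters; all parameter values (F0, formant frequencies and bandwidths, the ERB-like channel width) are illustrative assumptions, not the model of the paper.

```python
import numpy as np

F0 = 200.0           # fundamental (Hz); harmonics every 200 Hz
FS, DUR = 16000, 0.5
t = np.arange(int(FS * DUR)) / FS
F1, F2 = 600.0, 1600.0      # assumed formant frequencies (Hz)
BW1, BW2 = 80.0, 120.0      # assumed formant bandwidths (Hz)

harm = np.arange(1, 16) * F0   # harmonics up to 3 kHz
# formant-shaped harmonic amplitudes (two resonance peaks)
amp = 1/(1 + ((harm - F1)/BW1)**2) + 1/(1 + ((harm - F2)/BW2)**2)

def envelope(x):
    # analytic-signal envelope via an FFT-based Hilbert transform (numpy only)
    X = np.fft.fft(x)
    h = np.zeros(len(x))
    h[0] = 1; h[1:len(x)//2] = 2; h[len(x)//2] = 1
    return np.abs(np.fft.ifft(X * h))

def f0_mod_depth(fc):
    # crude "auditory channel": Gaussian spectral weighting, width ~1 ERB
    sigma = 24.7 + 0.108 * fc
    w = np.exp(-0.5 * ((harm - fc)/sigma)**2)
    x = np.sum((w * amp)[:, None] * np.cos(2*np.pi*harm[:, None]*t), axis=0)
    env = envelope(x)
    spec = np.abs(np.fft.rfft(env - env.mean())) / len(env)
    k = int(round(F0 * DUR))            # FFT bin at F0
    return 2 * spec[k] / env.mean()     # envelope modulation depth at F0

for fc in (600, 1100, 1600):            # on F1, between formants, on F2
    print(fc, round(f0_mod_depth(fc), 3))
```

Sweeping `fc` across a channel array gives a fluctuation profile whose minima sit at the formants; under the hypothesized code, fluctuation-tuned midbrain neurons would convert that profile into a rate profile that marks the formant frequencies.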

