Differential contributions of synaptic and intrinsic inhibitory currents to speech segmentation via flexible phase-locking in neural oscillators.

Benjamin R Pittman-Polletta, Nancy J Kopell, Miles A Whittington, David A Stanley, Charles E Schroeder, Yangyang Wang, Boris S Gutkin

doi:10.1371/journal.pcbi.1008783

Abstract

Current hypotheses suggest that speech segmentation (the initial division and grouping of the speech stream into candidate phrases, syllables, and phonemes for further linguistic processing) is executed by a hierarchy of oscillators in auditory cortex. Theta (∼3-12 Hz) rhythms play a key role by phase-locking to recurring acoustic features marking syllable boundaries. Reliable synchronization to quasi-rhythmic inputs, whose variable frequency can dip below cortical theta frequencies (down to ∼1 Hz), requires "flexible" theta oscillators whose underlying neuronal mechanisms remain unknown. Using biophysical computational models, we found that the flexibility of phase-locking in neural oscillators depended on the types of hyperpolarizing currents that paced them. Simulated cortical theta oscillators flexibly phase-locked to slow inputs when these inputs caused both (i) spiking and (ii) the subsequent buildup of outward current sufficient to delay further spiking until the next input. The greatest flexibility in phase-locking arose from a synergistic interaction between intrinsic currents that was not replicated by synaptic currents at similar timescales. Flexibility in phase-locking enabled improved entrainment to speech input, optimal at mid-vocalic channels, which in turn supported syllabic-timescale segmentation through identification of vocalic nuclei. Our results suggest that synaptic and intrinsic inhibition contribute to frequency-restricted and -flexible phase-locking in neural oscillators, respectively. Their differential deployment may enable neural oscillators to play diverse roles, from reliable internal clocking to adaptive segmentation of quasi-regular sensory inputs like speech.
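Conditions (i) and (ii) can be illustrated with a deliberately simplified toy simulation. The sketch below is not the biophysical model used in this work: it assumes a perfect integrate-and-fire unit with a single spike-triggered, exponentially decaying outward current, and every name and parameter value (`simulate`, `count_spontaneous`, `tonic`, `b`, `tau_w`, `pulse_amp`) is hypothetical and dimensionless. With the default values the unit free-runs at roughly 4 Hz and is driven by 50 ms pulses at 2 Hz, that is, by an input slower than its own rhythm.

```python
import math


def simulate(pulse_amp, b=2.0, tonic=8.0, tau_w=0.5,
             period=0.5, width=0.05, t_end=5.0, dt=1e-4):
    """Toy perfect integrate-and-fire unit with a spike-triggered outward current.

    Returns a list of (spike_time, evoked) pairs; `evoked` is True when the
    spike was produced on a time step during which the input pulse was on.
    All parameter values are hypothetical (time in seconds, voltage in
    units of the spike threshold).
    """
    steps_per_period = round(period / dt)
    width_steps = round(width / dt)
    decay = math.exp(-dt / tau_w)       # per-step decay of the outward current
    v, w = 0.0, 0.0
    spikes = []
    for step in range(round(t_end / dt)):
        pulse_on = (step % steps_per_period) < width_steps
        drive = tonic + (pulse_amp if pulse_on else 0.0) - w
        v = max(0.0, v + dt * drive)    # forward Euler step; v is floored at 0
        w *= decay
        if v >= 1.0:                    # threshold crossing: spike and reset
            spikes.append(((step + 1) * dt, pulse_on))
            v = 0.0
            w += b                      # each spike adds outward current
    return spikes


def count_spontaneous(spikes):
    """Number of spikes produced while the input pulse was off."""
    return sum(1 for _, evoked in spikes if not evoked)


if __name__ == "__main__":
    for amp in (300.0, 20.0):           # strong versus weak input pulses
        result = simulate(pulse_amp=amp)
        print(f"pulse_amp={amp}: {len(result)} spikes, "
              f"{count_spontaneous(result)} between pulses")
```

In this toy setting, strong pulses evoke a burst whose accumulated outward current keeps the unit silent until the next pulse, so all spikes fall inside the pulse windows. Weak pulses evoke too few spikes to build up that current, and the unit also fires on its own between pulses.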

Highlights

Conventional models of speech processing [1,2,3] suggest that decoding proceeds by matching chunks of speech of different durations with stored linguistic memory patterns or templates.
Θ rhythmicity was paced by either or both of two mechanisms: synaptic inhibition with a fast rise time and a slow decay time, as in the hippocampus [48] and previous models of syllable segmentation [45]; and θ-frequency sub-threshold oscillations (STOs) resulting from the interaction of a pair of intrinsic currents activated at subthreshold membrane potentials, namely a depolarizing persistent sodium current and a hyperpolarizing, slowly activating m-current [49].
We began by qualitatively matching in vitro recordings from layer 5 θ-resonant pyramidal cells [50] (Fig 2). As their resting membrane potential is raised over a few mV, these regular spiking (RS) cells exhibit a characteristic transition from tonic δ-rhythmic spiking to tonic θ-rhythmic spiking through so-called mixed-mode oscillations (MMOs, here doublets of spikes spaced a θ period apart occurring at a δ frequency) [50].

Summary

Introduction

Conventional models of speech processing [1,2,3] suggest that decoding proceeds by matching chunks of speech of different durations with stored linguistic memory patterns or templates. Speech is a multiscale phenomenon, but both the amplitude modulation of continuous speech and the motor physiology of the speech apparatus are dominated by syllabic timescales, i.e., δ/θ frequencies (1-9 Hz) [23,24,25,26,27]. This syllabic timescale information is critical for speech comprehension [11,12,26,28,29,30,31], as is speech-brain entrainment at δ/θ frequencies [32,33,34,35,36,37,38], which may play a causal role in speech perception [39,40,41,42]. The fact that oscillator-based syllable boundary detection performs better than classical algorithms [45,46] argues for the role of endogenous rhythmicity (as opposed to merely event-related responses to rhythmic inputs) in speech segmentation and perception.
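Entrainment to a quasi-rhythmic input is often quantified by a phase-locking value: the length of the mean resultant vector of spike phases measured within each input cycle. The sketch below shows one common way to compute it and is not necessarily the measure used in this work. The function names (`input_phases`, `phase_locking_value`) and the example pulse times are hypothetical, and the phase is assumed to advance linearly between successive input pulses.

```python
import cmath
import math
from bisect import bisect_right


def input_phases(spike_times, pulse_times):
    """Phase (radians) of each spike within the input cycle that contains it.

    `pulse_times` must be sorted; the phase runs linearly from 0 to 2*pi
    between successive pulses, so cycles of unequal length are handled.
    Spikes before the first pulse or at/after the last pulse are skipped.
    """
    phases = []
    for t in spike_times:
        k = bisect_right(pulse_times, t) - 1   # index of the last pulse <= t
        if 0 <= k < len(pulse_times) - 1:
            start, stop = pulse_times[k], pulse_times[k + 1]
            phases.append(2 * math.pi * (t - start) / (stop - start))
    return phases


def phase_locking_value(phases):
    """Mean resultant vector length: 1 for identical phases, near 0 if spread."""
    if not phases:
        return 0.0
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


if __name__ == "__main__":
    pulses = [0.0, 0.3, 0.5, 1.1, 1.35, 2.0]   # quasi-rhythmic input (seconds)
    # One spike a quarter of the way through every cycle: perfectly locked.
    locked = [a + 0.25 * (b - a) for a, b in zip(pulses, pulses[1:])]
    print(phase_locking_value(input_phases(locked, pulses)))
```

Defining phase relative to the input pulses, and not relative to a fixed clock, means a cell that fires at the same relative point in every cycle scores as locked even when the cycle length varies, which is the relevant case for quasi-rhythmic inputs like speech.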

Methods

Results

Conclusion