Abstract

On-line comprehension of natural speech requires segmenting the acoustic stream into discrete linguistic elements. This process is argued to rely on theta-gamma oscillation coupling, which can parse syllables and encode them in decipherable neural activity. Speech comprehension also strongly depends on contextual cues that help predict speech structure and content. To explore the effects of theta-gamma coupling on bottom-up/top-down dynamics during on-line syllable identification, we designed a computational model (Precoss: predictive coding and oscillations for speech) that can recognise syllable sequences in continuous speech. The model uses predictions from internal spectro-temporal representations of syllables and theta oscillations to signal syllable onsets and duration. Syllable recognition is best when theta-gamma coupling is used to temporally align spectro-temporal predictions with the acoustic input. This neurocomputational modelling work demonstrates that the notions of predictive coding and neural oscillations can be brought together to account for on-line dynamic sensory processing.

Highlights

  • Because the vector $I_f$ determines the global attractor, sequential activation of the gamma units makes the global attractor change continuously over time and generate the pattern corresponding to syllable ‘ω’ when $\upsilon^{(1)}_{\omega} = 1$ and $\upsilon^{(1)}_{\text{not }\omega} = 0$. The outputs of this level are the states of the Hopfield network, which predict the activity of the frequency channels in the input, and the causal state associated with the slow amplitude modulation: $\upsilon^{(0)}_{f} = x^{(1)} + \eta^{(1)}_{f}$ and $\upsilon^{(0)}_{A} = \upsilon^{(1)}_{A} + \eta^{(1)}_{A}$.

  • To define the syllables identified by the model, we considered the time average of the causal state ($\upsilon_{\omega}$, Eq (18)) of each of the syllable units taken within the boundaries defined by the gamma sequence (Supplementary Fig. 2).

  • Simulations were performed with the DEM Toolbox in SPM12 using MATLAB 2018b (The MathWorks, Inc., Natick, Massachusetts, United States).
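
The identification rule in the second highlight (time-averaging each syllable unit's causal state within the gamma-defined boundaries) can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. The data layout, the function name `identify_syllables`, the toy traces, and the choice to report the unit with the largest average are all assumptions made for illustration.

```python
def identify_syllables(causal_states, boundaries):
    """Return, for each window, the syllable unit with the largest mean causal state.

    causal_states: dict mapping a syllable label to a list with one value per
                   time step (hypothetical layout).
    boundaries:    list of (start, stop) sample indices, stop exclusive, standing
                   in for the windows delimited by the gamma sequence.
    """
    identified = []
    for start, stop in boundaries:
        if stop <= start:
            raise ValueError("each window needs at least one sample")
        # Time average of every syllable unit inside the current window.
        means = {
            label: sum(trace[start:stop]) / (stop - start)
            for label, trace in causal_states.items()
        }
        # Assumption: the identified syllable is the unit with the highest average.
        identified.append(max(means, key=means.get))
    return identified


# Toy traces for two hypothetical syllable units over six time steps.
states = {
    "ba": [0.9, 0.8, 0.7, 0.1, 0.2, 0.1],
    "da": [0.1, 0.2, 0.3, 0.9, 0.8, 0.9],
}
windows = [(0, 3), (3, 6)]
result = identify_syllables(states, windows)
```

Here `result` holds one label per window, so a sequence of gamma-defined windows yields a sequence of identified syllables.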


Summary

Results

The model presented above includes a physiologically motivated theta oscillation that is driven by the slow amplitude modulations of the speech waveform and signals information about syllable onset and duration to a gamma component. The simulations further show that the model performed best when syllable units were reset after completion of each gamma-unit sequence (based on internal information about the spectral syllable structure), and when the gamma rate was driven by theta-gamma coupling, irrespective of whether it was stimulus-timed (red dashed arrow in A) or endogenously timed (red dashed circle in B). The coupling was more efficient when it arose from a theta oscillator driven by the speech acoustics: the model was marginally more resilient to speech rate variations (Fig. 4), and it had the best accuracy versus complexity trade-off (Fig. 5).
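
To make the role of the coupling concrete, the following Python sketch contrasts a gamma sequence whose rate is set by the theta-signalled syllable duration with one that runs at a fixed rate. It is a toy illustration, not the model's generative equations. The number of gamma units (8), the fixed unit length (25 ms), and both function names are assumptions introduced here.

```python
def coupled_gamma_unit(t_ms, onset_ms, duration_ms, n_units=8):
    """Index of the active gamma unit when the units share the syllable duration.

    The theta-signalled duration is split equally across n_units, so the
    sequence always ends when the syllable ends. Times are integer milliseconds.
    """
    if duration_ms <= 0 or n_units <= 0:
        raise ValueError("duration_ms and n_units must be positive")
    elapsed = t_ms - onset_ms
    if not 0 <= elapsed < duration_ms:
        return None  # outside the syllable
    return elapsed * n_units // duration_ms


def fixed_gamma_unit(t_ms, onset_ms, unit_ms=25, n_units=8):
    """Index of the active gamma unit when every unit lasts a fixed unit_ms."""
    if unit_ms <= 0 or n_units <= 0:
        raise ValueError("unit_ms and n_units must be positive")
    elapsed = t_ms - onset_ms
    if elapsed < 0:
        return None  # before syllable onset
    index = elapsed // unit_ms
    return index if index < n_units else None  # sequence already finished


# A short 120 ms syllable, sampled 1 ms before it ends.
coupled_last = coupled_gamma_unit(119, 0, 120)
fixed_last = fixed_gamma_unit(119, 0)
```

In this toy setting the coupled sequence reaches its last unit just as the short syllable ends, whereas the fixed-rate sequence is only partway through. This is one way to picture how coupling keeps the spectro-temporal predictions aligned with the input when speech rate varies.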

Discussion
Methods
Code availability