Abstract

Recent hypotheses on the potential role of neuronal oscillations in speech perception propose that speech is processed on multi-scale temporal analysis windows formed by a cascade of neuronal oscillators locked to the input pseudo-rhythm. In particular, Ghitza (2011) proposed that the oscillators are in the theta, beta, and gamma frequency bands, with the theta oscillator as the master, tracking the input syllabic rhythm and setting a time-varying, hierarchical window structure synchronized with the input. In the study described here, the hypothesized role of theta was examined by measuring the intelligibility of speech with a manipulated modulation spectrum. Each critical-band signal was manipulated by controlling the degree of temporal envelope flatness. Intelligibility of speech with critical-band envelopes that are flat is poor; inserting extra information, restricted to the input syllabic rhythm, markedly improves intelligibility. It is concluded that flattening the critical-band envelopes prevents the theta oscillator from tracking the input rhythm, which disrupts the hierarchical window structure that controls the decoding process. Reinstating the input-rhythm information revives the tracking capability, restoring the synchronization between the window structure and the input and allowing additional information to be extracted from the flat modulation spectrum.

Highlights

  • There is a remarkable correspondence between the time span of phonetic, syllabic, and phrasal units, on the one hand, and the frequency range of the gamma, beta, theta, and delta neuronal oscillations, on the other

  • Intelligibility of the last four digits in seven-digit sequences was measured as a function of judiciously manipulated changes in critical-bands envelope flatness, while attending to the disassociation between parsing and decoding

  • We found that the intelligibility of stimuli with flat critical-band envelopes is poor
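
The idea of controlling the degree of temporal envelope flatness in a single band can be illustrated with a small sketch. This is not the procedure used in the study: the rectify-and-smooth envelope estimate, the `flatness` exponent, the function names, and the synthetic test signal (a 1 kHz carrier modulated at 4 Hz) are all assumptions made for illustration only.

```python
import math


def smooth_envelope(x, half_window):
    """Estimate a temporal envelope: rectify x, then take a centered moving average."""
    n = len(x)
    prefix = [0.0]  # running sum of |x| so each window average is O(1)
    for v in x:
        prefix.append(prefix[-1] + abs(v))
    env = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        env.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return env


def flatten_envelope(x, flatness, half_window=40, floor=1e-9):
    """Scale x so its envelope moves toward a constant.

    flatness = 0 leaves the signal unchanged; flatness = 1 divides out the
    estimated envelope entirely (hypothetical parameterization).
    """
    env = smooth_envelope(x, half_window)
    target = sum(env) / len(env)  # keep the overall level roughly constant
    return [v * (target / max(e, floor)) ** flatness for v, e in zip(x, env)]


def envelope_cv(x, half_window=40):
    """Coefficient of variation of the estimated envelope (lower means flatter)."""
    env = smooth_envelope(x, half_window)
    mean = sum(env) / len(env)
    var = sum((e - mean) ** 2 for e in env) / len(env)
    return math.sqrt(var) / mean


def demo_band_signal(fs=8000, dur=1.0, carrier_hz=1000.0, rate_hz=4.0, depth=0.8):
    """Synthetic stand-in for one band: a carrier with a slow, syllable-like modulation."""
    n = int(fs * dur)
    return [
        (1.0 + depth * math.sin(2 * math.pi * rate_hz * k / fs))
        * math.sin(2 * math.pi * carrier_hz * k / fs)
        for k in range(n)
    ]
```

Calling `envelope_cv(flatten_envelope(demo_band_signal(), 1.0))` and comparing it with `envelope_cv(demo_band_signal())` shows how much of the slow modulation the flattening removes; intermediate `flatness` values give partially flattened envelopes.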


Summary

Introduction

There is a remarkable correspondence between the time span of phonetic, syllabic, and phrasal units, on the one hand, and the frequency range of the gamma, beta, theta, and delta neuronal oscillations, on the other. The key property that enabled an explanation of the behavioral data was the capability of the window structure to stay synchronized with the input; performance is high so long as the oscillators are locked to the input rhythm (and within their intrinsic frequency range), and it drops once the oscillators are out of lock (e.g., hit their boundaries). This computation principle was realized by the phenomenological model shown, termed Tempo. Note that the term “parsing” as employed here does not refer to the …

