Individuals suffering from vision loss of a peripheral origin may learn to understand spoken language at a rate of up to about 22 syllables per second (syl/s), exceeding by far the maximum performance level of untrained listeners (ca. 8 syl/s). Previous findings (Dietrich et al., 2013, BMC Neuroscience) indicate that, in addition to the classical language zones (left inferior frontal gyrus, bilateral middle temporal gyrus), right visual cortex (V1), left supplementary motor area (SMA), and cerebellum contribute to the processing of accelerated speech in blind subjects. As an extension, the present training experiment using functional magnetic resonance imaging (fMRI) addresses the question of whether acquisition of ultra-fast (18 syl/s) speech perception skills induces central-visual hemodynamic activation in late-blind participants. Furthermore, we asked to what extent subjects with normal or residual vision can improve their understanding of accelerated verbal utterances and whether differential training effects for sighted as compared to blind subjects can be demonstrated. To these ends, prior to and after a training period of ca. six months, fMRI was performed while subjects were listening to forward and time-reversed sentence utterances at a moderately fast (8 syl/s) or an ultra-fast (18 syl/s) syllable rate. Eight participants, varying in their degree of residual visual function, considerably improved their comprehension of ultra-fast speech (pre-training: 9%, SD = 13.4; post-training: 70%, SD = 16.9; correct words in a sentence repetition task). Among other regions, left SMA, bilateral cerebellum, and right V1 showed a significant interaction (ANOVA) of the factors rate (8 vs. 18 syl/s) and stage (untrained vs. trained). After training, during the ultra-fast speech condition, all subjects displayed an increase of hemodynamic activation in left SMA, and cerebellar activation was increased in five of eight participants.
Training-induced hemodynamic activation of the central-visual system was found only in subjects with very low visual acuity. Thus, perceptual learning seems to involve SMA-cerebellar circuits, primarily known from motor learning tasks. It can be assumed that SMA and cerebellum contribute to a mechanism of time-critical processing that is required for the reconstruction of the syllabic structure of ultra-fast speech during lexical encoding toward verbal working memory. Furthermore, in spite of similar behavioral performance, trained experts appear to use distinct strategies of ultra-fast speech processing depending on whether the occipital cortex is still deployed for visual processing (Fig. 1).