Hierarchical temporal structuring of speech is key to transferring linguistic information at multiple scales for effective communication. This study investigated and linked the hierarchical temporal cues of the kinematic and acoustic modalities of natural, unscripted speech in neurologically healthy and impaired speakers. Thirteen individuals with amyotrophic lateral sclerosis (ALS) and 10 age-matched healthy controls performed a story-telling task. The hierarchical temporal structure of the speech stimulus was measured by (a) 26 articulatory-kinematic features characterizing the depth, phase synchronization, and coherence of temporal modulation of the tongue tip, tongue body, lower lip, and jaw, at three hierarchically nested timescales corresponding to prosodic stress, syllables, and onset-rime/phonemes, and (b) 25 acoustic features characterizing the parallel aspects of temporal modulation of five critical-spectral-band envelopes. All features were compared between groups. For each aspect of temporal modulation, the contributions of all articulatory features to the parallel acoustic features were evaluated by group. The disease affected the articulatory and acoustic features in generally consistent ways, manifested as reduced modulation depths of most articulators and critical-spectral-band envelopes, primarily at the timescales of syllables and onset-rime/phonemes. For healthy speakers, the strongest articulatory-acoustic relationships were found for (a) the jaw and lip, in modulating stress timing, and (b) the tongue tip, in modulating the timing relation between onset-rime/phonemes and syllables. For speakers with ALS, the tongue body, tongue tip, and jaw all showed their greatest contributions in modulating syllable timing. The observed disease impacts likely reflect reduced entrainment of speech motor activities to finer-grained linguistic events, presumably due to the dynamic constraints of the neuromuscular system.
To accommodate these restrictions, speakers with ALS appear to use their residual articulatory motor capacities to accentuate and convey the perceptually most salient temporal cues underpinned by the syllable-centric parsing mechanism. This adaptive strategy has potential implications for managing neuromotor speech disorders.