While the perception of segmental cues in speech has been extensively studied, evidence on the perception of suprasegmental cues is largely missing. A critical barrier in this area is the difficulty of measuring listeners’ ability to perceive a combination of multiple acoustic cues. The classic discrimination paradigm is limited by its heavy reliance on auditory short-term memory and by its long testing time. This study examines the utility of a novel measure that is potentially more efficient and offers a more fine-grained account of the perception of suprasegmental cues. Building on evidence that listeners can shadow and rapidly imitate speech cues faithfully, we use the degree of alignment between the suprasegmental cues in listeners’ rapid imitations and those in the speech stimuli as an index of perception quality. This alignment is quantified with dynamic time warping (DTW), modified to account for the intensity and fundamental frequency sequences simultaneously so that the alignment results are more reliable. Data collected to date provide rich information about individual differences in perception, and results obtained from rapid imitation show improved utility over those obtained from the discrimination paradigm. The theoretical and clinical implications of this work are also discussed.
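
As a rough illustration of how an alignment score over two cue sequences might be computed, the following Python sketch runs textbook DTW on frame-wise vectors that pair fundamental frequency with intensity. It is not the modified procedure used in this study, whose details are not given here. The function names (`zscore`, `make_features`, `multidim_dtw`), the z-score normalization, the Euclidean local cost, the symmetric step pattern, and the path-length normalization are all assumptions made for illustration, and the contours are synthetic.

```python
import numpy as np


def zscore(x):
    """Standardize a 1-D contour so F0 (Hz) and intensity (dB) share a scale."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else x - x.mean()


def make_features(f0, intensity):
    """Stack normalized F0 and intensity into an (n_frames, 2) array.

    Assumes both contours are sampled on the same frames and contain
    no missing values (e.g. unvoiced frames already interpolated).
    """
    return np.column_stack([zscore(f0), zscore(intensity)])


def multidim_dtw(stimulus, imitation):
    """Return a length-normalized DTW distance between two feature arrays.

    Each input has shape (n_frames, n_features). The local cost is the
    Euclidean distance between frame vectors, so both cues contribute
    to a single warping path (an illustrative choice, not the paper's).
    """
    a = np.asarray(stimulus, dtype=float)
    b = np.asarray(imitation, dtype=float)
    n, m = len(a), len(b)

    # Pairwise local costs between every stimulus and imitation frame.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # Accumulated cost with the standard (match, insertion, deletion) steps.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev

    # Normalize by n + m so longer utterances are not penalized for length.
    return acc[n, m] / (n + m)


# Synthetic example: a rise-fall contour and a slower imitation of it.
t_stim = np.linspace(0, np.pi, 50)
t_imit = np.linspace(0, np.pi, 70)
stim = make_features(120 + 30 * np.sin(t_stim), 60 + 8 * np.sin(t_stim))
imit = make_features(125 + 25 * np.sin(t_imit), 58 + 9 * np.sin(t_imit))
print(multidim_dtw(stim, imit))
```

Under these assumptions a smaller distance indicates a closer match between the imitation and the stimulus. Because the warping path absorbs differences in speaking rate, an imitation that is simply slower than the stimulus is not penalized for its duration alone.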