The objective was to develop and evaluate a new sentence test, the Sentence Test with Adaptive Randomized Roving levels, intended to emulate everyday listening experience. The test was evaluated with both normal-hearing (NH) and cochlear implant (CI) groups, examining practicality, learning, test-retest variability, and interlist variability.

In experiment 1, each of 25 NH adults was tested using five lists, each comprising 30 sentences. Within each list, one male and one female speaker each spoke 15 sentences, and 10 sentences were presented at each of three presentation levels: 50, 65, and 80 dB SPL. The relative level of a speech-shaped noise was varied adaptively to estimate the speech reception threshold (SRT). List order was counterbalanced by staggering the allocation of lists to participants. To allow assessment of learning effects, no practice was given. The variability of mean SRTs across lists was small, but correction factors were derived for each list so that, after correction, all lists gave the same mean SRT. Test-retest variability was estimated by examining the corrected SRTs for each participant's five lists. In experiment 2, 25 CI users each received one test list after a small amount of practice. Experiment 3 examined the effect of speech rate using time-compressed speech, for age-matched NH participants and CI users.

The mean SRT for the NH participants was approximately -6 dB and was similar for the male and female speakers. There was a small but significant improvement in SRTs between the first and later lists administered, but no further improvement for subsequent lists. On the basis of the variability of the corrected SRTs within each participant, a 2.2 dB difference in SRT is meaningful for comparisons using one test list per condition, for a single participant. The percentage of key words correct varied with presentation level over a 13% range, being best at 65 dB SPL. Only 40% of the CI group achieved an SRT lower than 20 dB for both speakers.
There was large individual variability in the SRTs, and SRTs were higher for the female than for the male speaker. For the CI participants, the percentage of key words correct varied markedly with level, from 19% at the lowest level to 57% at the medium level. Time compression had a small effect for NH participants but a very large effect for CI participants.

The Sentence Test with Adaptive Randomized Roving levels seems practical to administer and is reasonably sensitive. For NH participants, a 2.2 dB difference in SRT is meaningful for a single list per condition and a single participant. Although learning effects were small for NH participants, it seems prudent to provide some practice sentences when testing hearing-impaired or CI participants. The very large effect of time compression for the CI group has implications for live voice testing of children, because speech rate is only poorly controlled in such testing.
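
The list-correction and test-retest steps described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the SRT values are hypothetical, and the critical-difference formula (1.96 × √2 × within-participant SD) is a common convention assumed here for illustration, not necessarily how the 2.2 dB figure was obtained.

```python
from math import sqrt
from statistics import mean, stdev


def list_corrections(srts):
    """Return one additive correction (dB) per list.

    srts holds one row per participant and one SRT (dB) per list.
    Each correction shifts that list's mean SRT onto the grand mean.
    """
    n_lists = len(srts[0])
    list_means = [mean(row[j] for row in srts) for j in range(n_lists)]
    grand_mean = mean(list_means)
    return [grand_mean - m for m in list_means]


def apply_corrections(srts, corrections):
    """Add the per-list correction to every participant's SRT."""
    return [[x + c for x, c in zip(row, corrections)] for row in srts]


def pooled_within_sd(corrected):
    """Pooled within-participant SD of the corrected SRTs.

    Assumes every participant completed the same number of lists,
    so the pooled variance is the mean of the per-participant variances.
    """
    return sqrt(mean(stdev(row) ** 2 for row in corrected))


def critical_difference(sd, z=1.96):
    """Assumed convention: difference between two single-list SRTs
    that exceeds test-retest noise at roughly the 95% level."""
    return z * sqrt(2) * sd


# Hypothetical SRTs (dB) for two participants and three lists.
srts = [[-6.0, -5.0, -7.0],
        [-4.0, -4.0, -4.0]]
corrections = list_corrections(srts)
corrected = apply_corrections(srts, corrections)
sd = pooled_within_sd(corrected)
print(corrections, sd, critical_difference(sd))
```

With real data, each row would hold one participant's five list SRTs, and the corrections would be estimated once from the normative group and then applied to later test results.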