Synthetic speech has been used for decades to test theories of human speech perception. Typically, the speech has been constructed to sound as natural as possible given the constraints of the experiment. An alternative strategy has been to examine the perception of intentionally impoverished stimuli such as time‐varying sinusoid (TVS) replicas of speech [Remez et al., Science 212, 947‐950 (1981)]. The TVS signals consist of three tones whose frequencies mimic the formant center frequencies of a natural sentence, and they exhibit few of the other acoustic properties of natural speech. TVS signals have been shown to sound extremely unnatural yet to be surprisingly intelligible. The goal of the present experiment was to determine some of the general acoustic characteristics of signals that are important for speech perception. This was accomplished by examining the perceptual consequences of adding simple temporal and spectral information to TVS sentences. The TVS signals were amplitude modulated at 100 Hz in order to give them more speechlike acoustic characteristics without giving them fundamental frequencies or harmonic structures. The modulation greatly improved the phonetic intelligibility of the acoustically sparse TVS signal. Listeners also judged the modulated signal to be significantly more natural sounding than the unmodulated TVS signal. Applying the same modulation to natural speech, however, reduced its intelligibility.
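
The signal manipulation described above can be illustrated with a minimal sketch. It is not the authors' synthesis procedure: the sampling rate, the linear formant glides, the tone amplitudes, the raised-cosine modulator shape, and the full modulation depth are all assumptions chosen for illustration, and the function names `tvs_replica` and `amplitude_modulate` are hypothetical. Only the three-tone structure and the 100 Hz modulation rate come from the text.

```python
import numpy as np


def tvs_replica(formant_tracks_hz, amplitudes, sample_rate):
    """Sum of tones whose instantaneous frequencies follow the given tracks.

    formant_tracks_hz: array of shape (n_tones, n_samples), in Hz.
    amplitudes: array of shape (n_tones, n_samples), linear gain per tone.
    """
    tracks = np.asarray(formant_tracks_hz, dtype=float)
    amps = np.asarray(amplitudes, dtype=float)
    # Integrate instantaneous frequency to obtain phase, so that
    # frequency glides do not introduce phase discontinuities.
    phase = 2.0 * np.pi * np.cumsum(tracks, axis=1) / sample_rate
    return np.sum(amps * np.sin(phase), axis=0)


def amplitude_modulate(signal, mod_rate_hz, sample_rate, depth=1.0):
    """Multiply the signal by a raised-cosine envelope (assumed shape).

    The envelope ranges from 1 - depth to 1 and repeats at mod_rate_hz.
    """
    signal = np.asarray(signal, dtype=float)
    t = np.arange(signal.size) / sample_rate
    modulator = 1.0 - 0.5 * depth * (1.0 - np.cos(2.0 * np.pi * mod_rate_hz * t))
    return signal * modulator


# Illustrative values only (assumptions, not measured formants).
sample_rate = 16000          # Hz
n_samples = sample_rate // 2  # 0.5 s of signal

tracks = np.vstack([
    np.linspace(500.0, 700.0, n_samples),    # tone following a made-up F1 glide
    np.linspace(1500.0, 1200.0, n_samples),  # tone following a made-up F2 glide
    np.linspace(2500.0, 2600.0, n_samples),  # tone following a made-up F3 glide
])
amps = np.vstack([
    np.full(n_samples, 1.0),
    np.full(n_samples, 0.5),
    np.full(n_samples, 0.25),
])

tvs = tvs_replica(tracks, amps, sample_rate)
tvs_am = amplitude_modulate(tvs, mod_rate_hz=100.0, sample_rate=sample_rate)
```

In a real replication the formant tracks and tone amplitudes would be estimated from a natural sentence rather than generated as straight-line glides, and the modulator waveform would need to match whatever the experiment actually used.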