This study addressed whether the perception of vocal fry depends on waveform shape, or whether fry is perceived categorically as a function of frequency alone. Voicelike stimuli were generated with a transmission-line synthesis program [Titze, Transcripts Care Prof. Voice (1983)] that can vary fundamental frequency (F0), open quotient (γ), and speed quotient (δ) while maintaining a realistic glottal shape. In Exp. I, all combinations of F0 (40, 60, 80, and 100 Hz), γ (0.1, 0.35, and 0.6), and δ (1.0, 2.5, 5.0, and 8.0) were judged by ten trained listeners using a binary fry/no-fry forced-response method. The results reinforced previous conclusions that F0 is the primary determinant in the perception of vocal fry [H. Hollien, J. Phonet. 2, 125–143 (1974)], regardless of waveshape. The mean percentage of stimuli judged as fry was 99.4%, 75.6%, 21.1%, and 2.77% at 40, 60, 80, and 100 Hz, respectively, with chance-level performance (50%) predicted at about 70 Hz. Differences among γ and δ values were statistically nonsignificant. Exp. II included only stimuli of 40 and 60 Hz, with each F0 containing all combinations of γ and δ values. A round-robin tournament was used, in which each stimulus competed against every other stimulus for a total of ½(N² − N) A-B comparisons. Although intrajudge reliability was high (r ≈ 0.8), interjudge agreement was not consistently as high, yielding nonsignificant differences among γ and δ means and suggesting that several waveshapes are perceptually plausible within acceptable F0 values.
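The arithmetic behind the design can be checked with a short sketch. It counts the Exp. I stimuli, evaluates the round-robin formula ½(N² − N), and estimates where the percentage judged as fry crosses 50%. Two points are assumptions, not statements from the abstract: the crossover is found by linear interpolation between the 60 and 80 Hz means (the abstract does not say how the 70 Hz figure was predicted), and N is taken to be the 12 waveshapes at a single F0 (the abstract does not say whether the tournament was run per F0 or over the pooled set). All function and variable names are illustrative.

```python
from itertools import combinations, product

# Stimulus parameters as listed in the abstract.
F0_HZ = [40, 60, 80, 100]
OPEN_QUOTIENT = [0.1, 0.35, 0.6]        # gamma
SPEED_QUOTIENT = [1.0, 2.5, 5.0, 8.0]   # delta

# Mean percentage judged as fry at each F0 (Exp. I).
PCT_FRY = {40: 99.4, 60: 75.6, 80: 21.1, 100: 2.77}


def round_robin_pairs(n):
    """Number of unordered A-B pairs among n stimuli: (n^2 - n) / 2."""
    return (n * n - n) // 2


def crossover_f0(pct_by_f0, level=50.0):
    """Estimate the F0 at which the fry percentage falls through `level`.

    Assumption: straight-line interpolation between the two measured
    F0 values that bracket `level`. The abstract does not state the
    method actually used.
    """
    points = sorted(pct_by_f0.items())
    for (f_lo, p_lo), (f_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= level >= p_hi:
            if p_lo == p_hi:
                return float(f_lo)
            return f_lo + (f_hi - f_lo) * (p_lo - level) / (p_lo - p_hi)
    raise ValueError("level is not bracketed by the data")


# Exp. I: full factorial grid of F0 x gamma x delta.
exp1_stimuli = list(product(F0_HZ, OPEN_QUOTIENT, SPEED_QUOTIENT))

# Waveshapes available at one F0 (gamma x delta).
waveshapes = list(product(OPEN_QUOTIENT, SPEED_QUOTIENT))

if __name__ == "__main__":
    print(len(exp1_stimuli))                    # 48
    print(len(waveshapes))                      # 12
    print(round_robin_pairs(len(waveshapes)))   # 66 (assumes N = 12 per F0)
    print(round(crossover_f0(PCT_FRY), 1))      # 69.4
```

Under the interpolation assumption the crossover lands near 69.4 Hz, consistent with the "about 70 Hz" reported. If the tournament instead pooled both F0 values (N = 24), the same formula would give 276 comparisons.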