It is known that the voicing contrast for Japanese word-initial stops is primarily realized as differences in the voice onset time (VOT). However, recent studies have reported that voiced stops are more often produced with a positive VOT than with a negative VOT among the younger generation nationwide. It is also known that post-stop F0 is associated with the stop contrast, but the degree of F0 use differs from region to region. This study explores whether the difference in post-stop F0 functions as a perceptual cue to the stop contrast along with VOT. Fifty-five college students who are native listeners from four different regions participated in two or three perception tests. The results show that VOT is a primary cue to the voiced-voiceless distinction of word-initial stops, but that the effect of post-stop F0 on the stop contrast is marginal. The post-stop F0 is involved in perception only when VOT is ambiguous, such that a sound with high F0 is more often perceived as a voiceless stop, but not vice versa. The results of this study indicate that the acoustic parameters associated with the stop contrast are not the same in production and perception, and suggest that other factors such as context, which is not an acoustic characteristic, may also be involved in the stop contrast.