A saliency map is the bottom-up contribution to the deployment of exogenous attention. Both the map and its underlying neural mechanism are hard to identify because of the influence of top-down signals. A recent study showed that neural activity in V1 could create a bottom-up saliency map (Zhang et al. in Neuron 73(1):183-192, 2012). In this paper, we tested whether this conclusion generalizes to complex natural scenes. To avoid top-down influences, each image was presented at low contrast for only 50 ms and was followed by a high-contrast mask, which rendered the whole image invisible to participants (confirmed by a forced-choice test). We adopted the Posner cueing paradigm to measure the spatial cueing effect (i.e., saliency) with an orientation discrimination task. A positive cueing effect was found, and its magnitude was consistent with the prediction of a computational saliency model. In a subsequent fMRI experiment, we used the same masked natural scenes as stimuli and measured BOLD responses to the predicted salient region (relative to the background). We found that the BOLD signal in V1, but not in other cortical areas, could predict the cueing effect well. These results suggest that the bottom-up saliency map of natural scenes could be created in V1, providing further evidence for the V1 saliency theory (Li in Trends Cogn Sci 6(1):9-16, 2002).