Many affective computing studies have developed automatic emotion recognition models, mostly using emotional images, audio and videos. In recent years, virtual reality (VR) has also been used as a method to elicit emotions in laboratory environments. However, there is still a need to analyse the validity of VR in order to extrapolate the results it produces and to assess the similarities and differences in physiological responses provoked by real and virtual environments. We investigated the cardiovascular oscillations of 60 participants during a free exploration of a real museum and its virtualisation viewed through a head-mounted display. Differences in heart rate variability features between the high and low arousal stimulus conditions were analysed through statistical hypothesis testing, and automatic arousal recognition models were developed across the real and the virtual conditions using a support vector machine algorithm with recursive feature selection. The subjects' self-assessments suggested that both museums elicited low and high arousal levels. In the real museum, cardiovascular responses differed between the arousal conditions, including differences in vagal activity, and arousal recognition reached 72.92% accuracy. However, we did not find the same arousal-based autonomic nervous system change pattern during the virtual museum exploration. The results showed that, while the direct virtualisation of a real environment might be self-reported as evoking psychological arousal, it does not necessarily evoke the same cardiovascular changes as a real arousing elicitation. These findings contribute to the understanding of the use of VR in emotion recognition research; future research is needed to study arousal and emotion elicitation in immersive VR.
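
The abstract does not list the heart rate variability features that were analysed. As an illustration only, the sketch below computes four standard time-domain features from a series of RR intervals; RMSSD is commonly read as an index of vagal activity. The function name `hrv_time_domain`, the dictionary keys and the example RR values are hypothetical and are not taken from the study.

```python
import math


def hrv_time_domain(rr_ms):
    """Return basic time-domain HRV features for RR intervals given in milliseconds."""
    n = len(rr_ms)
    if n < 3:
        raise ValueError("need at least three RR intervals")

    mean_rr = sum(rr_ms) / n

    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))

    # Successive differences between adjacent RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]

    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # pNN50: percentage of successive differences larger than 50 ms.
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)

    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd, "pnn50": pnn50}


# Made-up RR intervals (ms), for illustration only; not data from the study.
rr_example = [800, 810, 790, 860, 800]
features = hrv_time_domain(rr_example)
```

In a pipeline of the kind the abstract describes, features like these would be computed per stimulus condition and then passed to the statistical tests and to the classifier; the windowing and preprocessing used in the study are not specified here.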