Objective. To create highly immersive virtual reality (VR) experiences, it is important to address not only the visual sense but also to involve multimodal sensory input. To achieve optimal results, the temporal and spatial synchronization of these multimodal inputs is critical. It is therefore necessary to find methods to objectively evaluate the synchronization of VR experiences while continuously tracking the user.
Approach. In this study, a passive touch experience was incorporated into a visual-tactile VR setup using VR glasses and mid-air tactile sensations. Inconsistencies in multimodal perception were intentionally integrated into a discrimination task. The participants' electroencephalogram (EEG) was recorded to obtain neural correlates of visual-tactile mismatch situations.
Main results. The results showed significant differences in the event-related potentials (ERPs) between match and mismatch situations. A biphasic ERP configuration, consisting of a positivity at 120 ms and a later negativity at 370 ms, was observed following a visual-tactile mismatch.
Significance. This late negativity could be related to the N400, which is associated with semantic incongruency. These results provide a promising approach towards the objective evaluation of visual-tactile synchronization in virtual experiences.