Audio-visual interactions play a crucial role in environmental perception. Incongruent audio-visual environments, although prominent in the urban fabric, have received little research attention. Contrasting exposures to visually natural environments combined with urban sounds, as well as the reverse scenario of visually urban environments with natural sounds, are explored for their restorative potential, both at the cognitive level by means of self-reported evaluations and at the instinctive level through measurements of electrodermal activity. The results from the test panel (n=67) point to a strong asymmetry in audio-visual urban-nature incongruency. Eventful natural sounds in an urban setting were perceived as more pleasant and relaxing than eventful urban sounds in a green environment. Augmenting an urban environment with natural sounds, even in the absence of any visual natural features, could therefore be an interesting soundscape intervention. In contrast, a visually natural environment paired with urban sounds is perceived strongly negatively. The findings further contribute to a growing body of literature emphasizing the importance of the sonic environment in designing restorative spaces.