The Sensory-Dependent Nature of Audio-Visual Interactions for Semantic Knowledge

Guillaume Vallet (guillaume.vallet@univ-lyon2.fr)
Université Lumière Lyon 2, Laboratoire EMC, 5 avenue Pierre Mendès France, 69676 Bron cedex, France
& Laval University, School of Psychology, 2325 rue des Bibliothèques, Quebec City (Quebec), G1V 0A6, Canada

Benoit Riou (benoit.riou@univ-lyon2.fr), Rémy Versace (remy.versace@univ-lyon2.fr)
Université Lumière Lyon 2, Laboratoire EMC, 5 avenue Pierre Mendès France, 69676 Bron cedex, France

Martine Simard (martine.simard@psy.ulaval.ca)
Laval University, School of Psychology, 2325 rue des Bibliothèques, Quebec City (Quebec), G1V 0A6, Canada

Abstract

The nature of audio-visual interactions is poorly understood for meaningful objects. According to the amodal view of knowledge, these interactions are indirect, mediated by semantic memory, whereas according to the modal view they are direct. This question, central to both the memory and multisensory frameworks, was assessed using a cross-modal (auditory-to-visual) priming paradigm with familiar objects. For half of the sound primes, an abstract visual mask was presented simultaneously to the participants. The results showed a cross-modal priming effect for semantically congruent objects compared to semantically incongruent objects presented without the mask. The mask interfered in the semantically congruent condition but had no effect in the semantically incongruent condition. The semantic specificity of the mask effect demonstrates a memory-related effect. The results suggest that audio-visual interactions are direct. The data support the modal approach to knowledge and the grounded cognition theory.

Keywords: Memory; Perception; Audio-visual; Masking; Priming; Grounded Cognition.

Introduction

Our environment is filled with meaningful objects representing semantic knowledge.
These objects are perceptually processed through several sensory channels, among which the auditory and visual modalities dominate the other senses in humans (for a review see Spence, 2007). Sensory information is mainly integrated on the basis of the temporal and spatial relationships between the stimuli (Calvert & Thesen, 2004), but also on the basis of the semantic relationships existing between them (Laurienti, Kraft, Maldjian, Burdette, & Wallace, 2004). Yet it remains uncertain how semantic memory is involved in multisensory perception (for a review see Doehrmann & Naumer, 2008). This issue hinges on the perceptual or semantic nature of cross-modal interactions, and thus questions the modal or amodal nature of knowledge (Vallet, Brunel, & Versace, 2010). The present study therefore aims at assessing the nature of audio-visual interactions using an innovative masking procedure.

Communication between different modalities is called interaction (or interplay). If this interaction involves a higher-level representation, it is called integration (Driver & Noesselt, 2008). An integrated object is a representation that is more than the sum of its parts. Previous research in the multisensory perception framework has principally studied the neural basis of the integration mechanism using meaningless stimuli (for reviews see Calvert & Thesen, 2004; Koelewijn, Bronkhorst, & Theeuwes, 2010). Fewer studies have been conducted with meaningful stimuli, and their goal was likewise to determine the brain substrates of multisensory integration (Doehrmann & Naumer, 2008). The semantic constraint is generally assessed by manipulating semantic congruency: a trial is congruent when the prime and the target refer to the same semantic object (e.g., a meowing sound followed by a cat's picture).
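The congruency manipulation described above can be sketched programmatically. The following is a minimal illustration, not the authors' actual materials: the object names and stimulus file names are invented for the example, and the pairing logic simply reuses the same object's target for congruent trials and a different object's target for incongruent trials.

```python
import random

# Hypothetical stimulus set: each object has an auditory prime
# and a visual target (file names are illustrative only).
objects = {
    "cat": ("meow.wav", "cat.png"),
    "dog": ("bark.wav", "dog.png"),
    "bell": ("ring.wav", "bell.png"),
    "car": ("engine.wav", "car.png"),
}

def build_trials(objects, seed=0):
    """Pair each auditory prime with a visual target.

    Congruent trials reuse the same object's picture; incongruent
    trials draw the picture from a different object.
    """
    rng = random.Random(seed)
    names = list(objects)
    trials = []
    for name in names:
        prime, target = objects[name]
        # Congruent: prime and target refer to the same object.
        trials.append({"prime": prime, "target": target,
                       "congruent": True})
        # Incongruent: target taken from another object.
        other = rng.choice([n for n in names if n != name])
        trials.append({"prime": prime,
                       "target": objects[other][1],
                       "congruent": False})
    rng.shuffle(trials)
    return trials

trials = build_trials(objects)
```

Each prime thus appears once in a congruent pairing and once in an incongruent one, which is the contrast that cross-modal priming studies measure response times against.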
Semantically congruent stimuli usually facilitate information processing (Chen & Spence, 2010), and may enhance memory performance in semantic (Laurienti et al., 2004) and episodic tasks (Lehmann & Murray, 2005).

In the memory framework, cross-modal interactions with meaningful stimuli are generally studied by inserting a delay between the stimuli. The best-known paradigm in this field is the cross-modal priming paradigm. The cross-modal priming effect is the facilitation of the processing of one stimulus in one modality (the target) by the previous presentation of another stimulus in another modality (the prime). Cross-modal priming may be observed between various modalities, such as the haptic and visual modalities (Easton, Srinivas, & Greene, 1997), but most studies have been conducted between the auditory and visual modalities (for a review see Schneider, Engel, & Debener, 2008). The increasing number of studies on audio-visual interactions involving meaningful stimuli is aimed at a better understanding of these effects. Nevertheless, the nature of audio-visual interactions, which is the central issue underlying these effects, remains poorly understood. The nature of these interactions depends on the nature