Abstract

Perception of the external world is based on the integration of inputs from different sensory modalities. Recent experimental findings suggest that this integration already occurs in lower-level cortical areas at early processing stages. The mechanisms underlying these early processes and the organization of the underlying circuitry are still a matter of debate. Here, we investigate audio-visual interactions by means of a simple neural network consisting of two layers of visual and auditory neurons. We suggest that the spatial and temporal aspects of audio-visual illusions can be explained within this simple framework, based on two main assumptions: auditory and visual neurons communicate via excitatory synapses; and spatio-temporal receptive fields differ between the two modalities, with auditory processing exhibiting a higher temporal resolution and visual processing a higher spatial acuity. With these assumptions, the model is able to: i) simulate the sound-induced flash fission illusion; ii) reproduce psychometric curves, assuming random variability in some parameters; iii) account for other audio-visual illusions, such as the sound-induced flash fusion and the ventriloquism illusions; and iv) predict that visual and auditory stimuli are combined optimally in multisensory integration. In sum, the proposed model provides a unifying account of spatio-temporal audio-visual interactions: it both explains a wide set of empirical findings and offers a framework for future experiments. Looking ahead, it may be used to understand the neural basis of Bayesian audio-visual inference.
