Abstract

Neural encoding, a crucial aspect of understanding the human brain's information-processing system, aims to establish a quantitative relationship between stimuli and the evoked brain activity. In visual neuroscience, population receptive field (pRF) models, which can explain how neurons in the primary visual cortex work, have enjoyed wide popularity and made steady progress in recent years. However, existing models rely either on inflexible prior assumptions about the pRF or on cumbersome parameter-estimation methods, which severely limits their expressiveness and interpretability. In this article, we propose a novel neural encoding framework that learns “what” and “where” with deep neural networks. It involves two separate aspects of neuron populations in the visual cortex: 1) their spatial characteristics (“where”) and 2) their feature selection (“what”). Specifically, our approach first encodes visual stimuli into hierarchical intermediate features with a pretrained deep neural network (DNN), then converts the DNN features into refined features through channel attention and a spatial receptive field (RF) to learn “where”, and finally regresses the refined features simultaneously onto voxel activities to learn “what”. Sparsity and smoothness regularization are adopted in our modeling approach so that the crucial RF can be estimated automatically, without prior assumptions about its shape. Furthermore, we extend the voxel-wise modeling approach to multi-voxel joint encoding models and show that this is conducive to rescuing voxels with poor signal-to-noise characteristics. Extensive empirical results demonstrate that the proposed method provides an effective strategy for neural encoding in the human visual cortex, with weaker prior constraints but higher encoding performance.
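
To make the pipeline concrete, below is a minimal PyTorch-style sketch of such a “what”/“where” readout on top of frozen DNN features. It is an illustration under stated assumptions, not the authors' implementation: the class name VoxelEncoder, the tensor shapes, the per-voxel RF and channel-attention parameterizations, and the penalty weights are all hypothetical.

```python
import torch
import torch.nn as nn

class VoxelEncoder(nn.Module):
    """Hypothetical "what"/"where" readout on top of frozen DNN features.

    feats are assumed to come from one layer of a pretrained backbone,
    with shape (batch, n_channels, fmap_size, fmap_size).
    """

    def __init__(self, n_channels, fmap_size, n_voxels):
        super().__init__()
        # "where": a freely parameterized spatial RF map per voxel,
        # with no parametric (e.g., 2-D Gaussian) shape assumption
        self.rf = nn.Parameter(0.01 * torch.randn(n_voxels, fmap_size, fmap_size))
        # "what": per-voxel channel weights acting as feature selection
        self.chan = nn.Parameter(0.01 * torch.randn(n_voxels, n_channels))
        self.bias = nn.Parameter(torch.zeros(n_voxels))

    def forward(self, feats):
        # pool each channel under the voxel's RF: (b, C, H, W) -> (b, V, C)
        pooled = torch.einsum('bchw,vhw->bvc', feats, self.rf)
        # weight channels and sum: refined features -> predicted voxel response
        return (pooled * self.chan).sum(dim=-1) + self.bias

    def penalties(self):
        # sparsity (L1) keeps each RF compact; smoothness penalizes
        # differences between neighboring RF pixels
        sparsity = self.rf.abs().mean()
        smooth = (
            ((self.rf[:, :, 1:] - self.rf[:, :, :-1]) ** 2).mean()
            + ((self.rf[:, 1:, :] - self.rf[:, :-1, :]) ** 2).mean()
        )
        return sparsity, smooth

# illustrative fit of all voxels at once; the lambda values are made up
model = VoxelEncoder(n_channels=512, fmap_size=14, n_voxels=2000)
pred = model(dnn_feats)                # dnn_feats: frozen DNN activations
sparsity, smooth = model.penalties()
loss = nn.functional.mse_loss(pred, bold) + 1e-3 * sparsity + 1e-2 * smooth
```

Because the regression over all voxels happens in a single forward pass, the same sketch can also be read as a multi-voxel joint encoding model in the sense of the extension described above; the sketch itself does not demonstrate the reported rescue of low signal-to-noise voxels.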
