Abstract

Visual encoding models are important computational models for understanding how information is processed along the visual stream. Many improved visual encoding models have been developed from the perspectives of model architecture and learning objective, but these have been limited to supervised learning. From the view of unsupervised learning mechanisms, this paper utilized a pre-trained neural network to construct a visual encoding model based on contrastive self-supervised learning for the ventral visual stream measured by functional magnetic resonance imaging (fMRI). We first extracted features using a ResNet50 model pre-trained with contrastive self-supervised learning (the ResNet50-CSL model), then trained a linear regression model for each voxel, and finally calculated the prediction accuracy of different voxels. Compared with a ResNet50 model pre-trained on a supervised classification task, the ResNet50-CSL model achieved equal or even somewhat better encoding performance in multiple visual cortical areas. Moreover, the ResNet50-CSL model forms hierarchical representations of input visual stimuli, similar to the hierarchical information processing of the human visual cortex. Our experimental results suggest that the encoding model based on contrastive self-supervised learning is a strong computational model that competes with supervised models, and that contrastive self-supervised learning is an effective learning method for extracting human brain-like representations.

Highlights

  • Understanding how the human brain functions is a subject that neuroscientists are constantly exploring, and the visual system is one of the most widely and deeply studied sensory systems [1]

  • The visual encoding model based on functional magnetic resonance imaging (fMRI) is a mathematical model that simulates the process of brain visual information processing to predict fMRI activity for any visual input stimulus based on a known or assumed visual perception mechanism, and it describes the relationship between visual inputs and fMRI responses [5,6]

  • The fMRI data were divided into seven distinct visual regions of interest, including V1, V2, V3, V4, the lateral occipital complex (LOC), the parahippocampal place area (PPA), and the fusiform face area (FFA)

Introduction

Understanding how the human brain functions is a subject that neuroscientists are constantly exploring, and the visual system is one of the most widely and deeply studied sensory systems [1]. The visual encoding model based on fMRI is a mathematical model that simulates the process of brain visual information processing to predict fMRI activity for any visual input stimulus based on a known or assumed visual perception mechanism, and it describes the relationship between visual inputs and fMRI responses [5,6]. Visual information is processed by a cascade of neural computations [7,8]. This process is extremely complex, so the mapping from the input stimulus space to the brain activity space can be regarded as nonlinear. Because the mechanism of brain visual information processing remains unclear, it is difficult to construct a model that directly characterizes such nonlinear relationships; a linearizing feature space is therefore usually introduced to assist the model building [9]. The construction of the feature space is the core of the linearizing encoding model and determines its encoding performance.
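The linearizing encoding pipeline described above can be sketched in a few lines of code. The sketch below is illustrative only: the actual study uses ResNet50-CSL features of the stimulus images and measured fMRI voxel responses, whereas here both are replaced by synthetic data, and ridge regression stands in as one common choice of regularized linear model. All variable names are hypothetical.

```python
# Minimal sketch of a voxel-wise linearizing encoding model.
# NOTE: features and fMRI responses are simulated; in the real pipeline,
# X would hold network features of the stimuli and Y the measured voxels.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_train, n_test, n_feat, n_vox = 200, 50, 100, 10

# Stand-ins for network features of training and test stimuli.
X_train = rng.standard_normal((n_train, n_feat))
X_test = rng.standard_normal((n_test, n_feat))

# Simulated voxel responses: a linear ground truth plus a little noise.
W_true = rng.standard_normal((n_feat, n_vox))
Y_train = X_train @ W_true + 0.1 * rng.standard_normal((n_train, n_vox))
Y_test = X_test @ W_true + 0.1 * rng.standard_normal((n_test, n_vox))

# One regularized linear regression per voxel; sklearn fits an
# independent weight vector for each output column.
model = Ridge(alpha=1.0).fit(X_train, Y_train)
Y_pred = model.predict(X_test)

def voxel_accuracy(y_true, y_pred):
    """Pearson correlation between measured and predicted responses,
    computed across test stimuli, separately for each voxel."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    return (yt * yp).sum(axis=0) / (
        np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0))

acc = voxel_accuracy(Y_test, Y_pred)
print(acc.shape)  # one prediction-accuracy score per voxel
```

Because the simulated responses really are linear in the features, the correlations come out near 1 here; with real fMRI data, accuracies vary widely across voxels and regions of interest.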

