Numerosity tuning in human association cortices and local image contrast representations in early visual cortex

Jacob M Paul, Tuomas C Ten Cate, Ben M Harvey, Martijn Van Ackooij

doi:10.1038/s41467-022-29030-z

Abstract

Human early visual cortex response amplitudes monotonically increase with numerosity (object number), regardless of object size and spacing. However, numerosity is typically considered a high-level visual or cognitive feature, while early visual responses follow image contrast in the spatial frequency domain. We find that, at fixed contrast, aggregate Fourier power (at all orientations and spatial frequencies) follows numerosity closely but nonlinearly with little effect of object size, spacing or shape. This would allow straightforward numerosity estimation from spatial frequency domain image representations. Using 7T fMRI, we show monotonic responses originate in primary visual cortex (V1) at the stimulus’s retinotopic location. Responses here and in neural network models follow aggregate Fourier power more closely than numerosity. Truly numerosity tuned responses emerge after lateral occipital cortex and are independent of retinotopic location. We propose numerosity’s straightforward perception and neural responses may result from the pervasive spatial frequency analyses of early visual processing.

Highlights

Human early visual cortex response amplitudes monotonically increase with numerosity, regardless of object size and spacing
We found monotonic increases in neural population response amplitude with increasing numerosity in the retinotopic locations of our stimuli, beginning in V1
While these monotonic responses follow numerosity closely, they are better predicted by aggregate Fourier power, which follows numerosity closely over a wide range of stimulus parameters for a fixed contrast
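The relationship between item number and spatial frequency content can be explored with a toy calculation. The sketch below is not the analysis pipeline used in the study: it assumes that "aggregate Fourier power" can be approximated by summing the amplitude spectrum over all non-DC components of a small binary image, and the function names (`dot_image`, `aggregate_fourier_amplitude`) are hypothetical. It uses a naive discrete Fourier transform from the Python standard library so that it runs without extra dependencies.

```python
import cmath
import math


def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(values)
    return [
        sum(values[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def dft_2d(image):
    """2D DFT computed separably: transform rows first, then columns."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(r) for r in zip(*cols)]


def aggregate_fourier_amplitude(image):
    """Sum of spectral amplitude over all frequencies except DC.

    Assumption: this is a simplified stand-in for the aggregate
    Fourier power measure described in the text, not its exact definition.
    """
    spectrum = dft_2d(image)
    total = sum(abs(c) for row in spectrum for c in row)
    return total - abs(spectrum[0][0])


def dot_image(size, positions):
    """Binary image with single-pixel 'items' at the given (row, col) positions."""
    image = [[0.0] * size for _ in range(size)]
    for r, c in positions:
        image[r % size][c % size] = 1.0
    return image


one_item = aggregate_fourier_amplitude(dot_image(16, [(3, 3)]))
two_items = aggregate_fourier_amplitude(dot_image(16, [(3, 3), (3, 4)]))
two_items_moved = aggregate_fourier_amplitude(dot_image(16, [(8, 10), (8, 11)]))
```

In this toy setting the summed amplitude is larger for two items than for one, and it is unchanged when the same configuration is translated (with wrap-around) to another part of the image, because translation only changes Fourier phase. Whether the measure stays stable across item size, spacing and shape is the empirical question the study addresses; this sketch does not demonstrate it.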

Summary

Introduction

Human early visual cortex response amplitudes monotonically increase with numerosity (object number), regardless of object size and spacing. Other fMRI studies using multivoxel pattern analyses[7,8] and representational similarity analyses[9] support the existence of numerosity-tuned neural populations in the human brain. Response properties of these neurons mirror properties of numerosity perception[3,6,10]. It has since been shown that monotonic[22] and tuned[23] responses to numerosity emerge in a probabilistic hierarchical generative network trained only to efficiently encode the image and maximize the likelihood of reconstructing it, and even in a randomly weighted network. In this model, the first stage decomposes the image using spatial receptive fields with surround suppression, as in the early visual system. The resulting monotonic responses to numerosity are spatially selective[22], but they are almost invariant to item size and spacing without the need for explicit object individuation or size normalization[19]. Another class of neural network model, the deep convolutional neural network, shows monotonic and numerosity-tuned units, even in randomly weighted networks[24]. Monotonic units emerge early in the network and feed into numerosity-tuned units, where different weights on these inputs give numerosity-tuned responses with different numerosity preferences.
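The last point, that different weights on the same monotonic inputs can yield different numerosity preferences, can be illustrated with a minimal sketch. It is a toy construction, not the architecture of the cited networks. It assumes a small bank of rectified units that increase monotonically with log2 numerosity, each with a different threshold, and a tuned unit that takes a weighted sum of them. All names, thresholds and weights are hypothetical and chosen only for illustration.

```python
import math

# Thresholds in log2 units, i.e. at numerosities 1, 2, 4, 8 and 16 (assumed values).
THRESHOLDS = [0.0, 1.0, 2.0, 3.0, 4.0]


def monotonic_units(numerosity):
    """Rectified units whose responses never decrease as numerosity increases."""
    x = math.log2(numerosity)
    return [max(0.0, x - t) for t in THRESHOLDS]


def tuned_response(numerosity, weights):
    """Weighted sum of the monotonic units' responses."""
    return sum(w * m for w, m in zip(weights, monotonic_units(numerosity)))


# Weights (+1, -2, +1) on three neighbouring thresholds produce a tent-shaped
# tuning curve that peaks at the middle threshold.
WEIGHTS_PREFER_4 = [0.0, 1.0, -2.0, 1.0, 0.0]
WEIGHTS_PREFER_8 = [0.0, 0.0, 1.0, -2.0, 1.0]


def preferred_numerosity(weights, candidates=range(1, 17)):
    """Numerosity (among the candidates) that gives the largest tuned response."""
    return max(candidates, key=lambda n: tuned_response(n, weights))
```

Here `preferred_numerosity(WEIGHTS_PREFER_4)` evaluates to 4 and `preferred_numerosity(WEIGHTS_PREFER_8)` to 8. The monotonic inputs are identical in both cases; only the readout weights differ.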

Methods

Results

Conclusion