Mid-level Representation Research Articles

Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested a texture statistics model, called the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.

SIGNIFICANCE STATEMENT Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
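The abstract names two analysis steps, a single-voxel encoding model evaluated on held-out responses and a variance partitioning analysis, without giving implementation details. The following is a minimal sketch of how such an analysis could look, not the authors' pipeline. It assumes closed-form ridge regression as the fitting method, uses synthetic data in place of P-S statistics and fMRI responses, and defines unique variance as the drop in held-out R² when one feature group is left out. All function names, the group names `lower_level` and `higher_order`, and the simulated voxel are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge weights (no intercept; features assumed roughly centered)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def held_out_r2(X_train, y_train, X_test, y_test, alpha=1.0):
    """Fit on the training split and return R^2 on the held-out split."""
    w = fit_ridge(X_train, y_train, alpha)
    ss_res = np.sum((y_test - X_test @ w) ** 2)
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def partition_variance(X_train, y_train, X_test, y_test, groups, alpha=1.0):
    """Unique variance of a group = R^2(all features) - R^2(all but that group)."""
    all_cols = np.concatenate([np.asarray(c) for c in groups.values()])
    full = held_out_r2(X_train[:, all_cols], y_train,
                       X_test[:, all_cols], y_test, alpha)
    unique = {}
    for name, cols in groups.items():
        rest = np.setdiff1d(all_cols, cols)
        reduced = held_out_r2(X_train[:, rest], y_train,
                              X_test[:, rest], y_test, alpha)
        unique[name] = full - reduced
    return full, unique

# Synthetic stand-in for one voxel: 400 "images", 6 features in two groups.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 6))
groups = {"lower_level": [0, 1, 2], "higher_order": [3, 4, 5]}

# Hypothetical voxel driven mostly by a higher-order feature, plus noise.
y = 0.3 * X[:, 0] + 1.0 * X[:, 3] + 0.5 * rng.standard_normal(400)

# Hold out the last 100 samples for evaluation.
X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

full_r2, unique = partition_variance(X_train, y_train, X_test, y_test, groups)
print(f"full model held-out R^2: {full_r2:.3f}")
for name, value in unique.items():
    print(f"unique variance, {name}: {value:.3f}")
```

For this simulated voxel, the higher-order group should carry most of the unique variance. Repeating the same computation per voxel and comparing across regions is the kind of comparison the abstract describes, although the regularization and cross-validation scheme used in the study are not stated here.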

Mid-level visual features directly support an array of behaviors; thus, they may be critical for understanding the functional organization of visual cortex. However, attempts at characterizing mid-level features have been hampered by the difficulty of describing these features in words: they exist in an “ineffable valley” between the describable patterns of low-level vision (e.g., edges) and the commonsense concepts of visual cognition (e.g., objects). Here we developed a novel approach to identify interpretable emergent properties of mid-level representations in deep neural network (DNN) models of visual cortex. Using this approach, we examined DNN models that were fit to scene-evoked fMRI responses in category-selective regions of visual cortex, specifically scene-selective cortex (sceneDNN) and object-selective cortex (objectDNN). Our method uses a semantically guided image-occlusion procedure to systematically characterize how DNN activations are driven by the classes of objects within a scene. We examined the relationship between mid-level features and several object properties that have previously been associated with response preferences in visual cortex: curvature, real-world size, animacy, naturalness, and spatial stability. We found that while mid-level features appear complex and difficult to describe at a surface level, large-scale computational analyses can reveal a latent underlying relationship to interpretable object properties. Specifically, we found that the mid-level representations of the sceneDNN support a latent preference for objects that are boxy and large in real-world size. In contrast, mid-level representations of the objectDNN support a complementary preference for objects that are curvy and small in real-world size. These effects were robust to variations of model hyperparameters and were reproducible across different DNN models. Our findings show that curvature and real-world size are emergent organizing principles of mid-level visual representation, and they suggest that differences in mid-level feature tuning may be critical for understanding the organization of visual cortex into category-selective patches.
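The occlusion procedure mentioned in this abstract can be illustrated with a small sketch. This is an assumption-laden toy, not the authors' method: it supposes that each scene comes with a per-pixel object-class label map, that occlusion means overwriting a class's pixels with a constant fill value, and that the effect of a class is the resulting drop in a scalar model response. The names `occlusion_effects` and `toy_model`, and the 4x4 "image", are hypothetical stand-ins for a real DNN unit and a segmented scene.

```python
import numpy as np

def occlusion_effects(image, label_map, model, fill_value=0.0):
    """Drop in model response when each labeled object class is occluded."""
    baseline = model(image)
    effects = {}
    for label in np.unique(label_map):
        occluded = image.copy()
        # Overwrite every pixel belonging to this class with the fill value.
        occluded[label_map == label] = fill_value
        effects[int(label)] = baseline - model(occluded)
    return effects

# Toy scene: a uniform 4x4 image with class 0 on the left half, class 1 on the right.
image = np.ones((4, 4))
label_map = np.zeros((4, 4), dtype=int)
label_map[:, 2:] = 1

def toy_model(img):
    """Hypothetical unit that responds only to the right half of the image."""
    return float(img[:, 2:].sum())

effects = occlusion_effects(image, label_map, toy_model)
print(effects)  # class 1 drives the response; class 0 does not
```

Aggregating such per-class effects over many scenes, and relating them to object properties such as curvature or real-world size, would be one way to approach the large-scale analysis the abstract refers to.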

Articles published on Mid-level Representation

Learning Invariant Inter-pixel Correlations for Superpixel Generation

Decoding face recognition abilities in the human brain.

FERMixNet: An Occlusion Robust Facial Expression Recognition Model with Facial Mixing Augmentation and Mid-Level Representation Learning

Depth- and semantics-aware multi-modal domain translation: Generating 3D panoramic color images from LiDAR point clouds

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation

A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex.

Self-information of radicals: A new clue for zero-shot Chinese character recognition

Joint multimodal sentiment analysis based on information relevance

Increasing the Efficiency of Policy Learning for Autonomous Vehicles by Multi-Task Representation Learning

Image-Based Navigation in Real-World Environments Via Multiple Mid-Level Representations: Fusion Models, Benchmark and Efficient Evaluation

Segment-based bag of visual words model for urban land cover mapping using polarimetric SAR data

A Modified HSIFT Descriptor for Medical Image Classification of Anatomy Objects

Deep neural network models of visual cortex reveal curvature and real-world size as organizing principles of mid-level representation

DeepFlux for Skeleton Detection in the Wild

CTC-Based Learning of Chroma Features for Score–Audio Music Retrieval

Cogmic space for narrative-based world representation

Visual sentiment analysis via deep multiple clustered instance learning

Learning Contour-Based Mid-Level Representation for Shape Classification

SSNet: Learning Mid-Level Image Representation Using Salient Superpixel Network

Heterogeneous Transfer Learning for Hyperspectral Image Classification Based on Convolutional Neural Network
