We effortlessly extract behaviorally relevant information from dynamic visual input to understand the actions of others. In the current study, we developed and tested several models to better understand the neural representational geometries supporting action understanding. Using fMRI, we measured brain activity as participants viewed a diverse set of 90 video clips depicting social and nonsocial actions in real-world contexts. We developed five behavioral models using arrangement tasks: two models reflecting behavioral judgments of the purpose (transitivity) and the social content (sociality) of the actions depicted in the video stimuli; and three models reflecting behavioral judgments of the visual content (people, objects, and scene) depicted in still frames of the stimuli. We evaluated how well these models predict neural representational geometry and tested them against semantic models based on verb and nonverb embeddings and visual models based on gaze and motion energy. Our results revealed that behavioral judgments of similarity reflect neural representational geometry better than semantic or visual models do throughout much of cortex. The sociality and transitivity models in particular captured a large portion of unique variance throughout the action observation network, extending into regions not typically associated with action perception, such as ventral temporal cortex. Overall, our findings expand the action observation network and indicate that the social content and purpose of observed actions are predominant in cortical representation.
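The abstract does not specify the analysis code, but comparing behavioral model geometries to neural geometries typically follows representational similarity analysis (RSA) logic: correlate the off-diagonal entries of a model representational dissimilarity matrix (RDM) with those of a neural RDM. The sketch below is a minimal, hypothetical illustration of that comparison; the random 90 x 90 matrices and the function name rdm_fit are placeholders, not the authors' actual data or pipeline.

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import squareform

# Hypothetical RDMs over the 90 video clips: one behavioral model RDM
# (e.g., pairwise dissimilarities from an arrangement task) and one
# neural RDM (e.g., from a cortical region). Random values stand in
# for real data here.
n_items = 90
rng = np.random.default_rng(0)
behavioral_rdm = squareform(rng.random(n_items * (n_items - 1) // 2))
neural_rdm = squareform(rng.random(n_items * (n_items - 1) // 2))

def rdm_fit(model_rdm, brain_rdm):
    """Spearman correlation between the off-diagonal (condensed) entries of two RDMs."""
    model_vec = squareform(model_rdm, checks=False)  # upper triangle as a vector
    brain_vec = squareform(brain_rdm, checks=False)
    rho, _ = spearmanr(model_vec, brain_vec)
    return rho

print(f"model-to-brain fit (Spearman rho): {rdm_fit(behavioral_rdm, neural_rdm):.3f}")
```

In practice, this fit would be computed per region or searchlight and per model, with the unique contribution of each model (e.g., sociality vs. transitivity) assessed by variance partitioning across models.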