Characterising and dissecting human perception of scene complexity

Cameron Kyle-Davidson,Elizabeth Yue Zhou,Dirk B Walther,Adrian G Bors,Karla K Evans

doi:10.1016/j.cognition.2022.105319

Cameron Kyle-Davidson, Elizabeth Yue Zhou + Show 3 more

Open Access

https://doi.org/10.1016/j.cognition.2022.105319

Copy DOI

Abstract

Humans can effortlessly assess the complexity of the visual stimuli they encounter. However, our understanding of how we do this, and the relevant factors that result in our perception of scene complexity remain unclear; especially for the natural scenes in which we are constantly immersed. We introduce several new datasets to further understanding of human perception of scene complexity. Our first dataset (VISC-C) contains 800 scenes and 800 corresponding two-dimensional complexity annotations gathered from human observers, allowing exploration for how complexity perception varies across a scene. Our second dataset, (VISC-CI) consists of inverted scenes (reflection on the horizontal axis) with corresponding complexity maps, collected from human observers. Inverting images in this fashion is associated with destruction of semantic scene characteristics when viewed by humans, and hence allows analysis of the impact of semantics on perceptual complexity. We analysed perceptual complexity from both a single-score and a two-dimensional perspective, by evaluating a set of calculable and observable perceptual features based upon grounded psychological research (clutter, symmetry, entropy and openness). We considered these factors’ relationship to complexity via hierarchical regressions analyses, tested the efficacy of various neural models against our datasets, and validated our perceptual features against a large and varied complexity dataset consisting of nearly 5000 images. Our results indicate that both global image properties and semantic features are important for complexity perception. We further verified this by combining identified perceptual features with the output of a neural network predictor capable of extracting semantics, and found that we could increase the amount of explained human variance in complexity beyond that of low-level measures alone. Finally, we dissect our best performing prediction network, determining that artificial neurons learn to extract both global image properties and semantic details from scenes for complexity prediction. Based on our experimental results, we propose the “dual information” framework of complexity perception, hypothesising that humans rely on both low-level image features and high-level semantic content to evaluate the complexity of images.

Full Text