Generative visual common sense: Testing analysis-by-synthesis on Mondrian-style image.

Ning Tang, Tao Gao, Jifan Zhou, Siyi Gong, Mowei Shen

doi:10.1037/xge0001413

Abstract

The well-known Mondrian-style images, aside from being aesthetically amusing, also reflect the core principles of human vision in their viewing experience. First, when we see a Mondrian-style image consisting only of a grid and primary colors, we may automatically interpret its causal history such that it was generated by recursively partitioning a blank scene. Second, the image we observe is open to many possible ways of partitioning, and their probabilities of dominating the interpretation can be captured by a probabilistic distribution. Moreover, the causal interpretation of a Mondrian-style image can emerge almost spontaneously, not being tailored to any specific task. Using Mondrian-style images as a case study, we demonstrate the generative nature of human vision by showing that a Bayesian model based upon an image-generation task can support a wide range of visual tasks with little retraining. Our model, learned from human-synthesized Mondrian-style images, could predict human performance in the perceptual complexity ranking, capture the transmission stability when images were iteratively passed among participants, and pass a visual Turing test. Our results collectively show that human vision is causal such that we interpret an image from the angle of how it was generated. The success of generalization with little retraining suggests that generative vision constitutes a type of common sense that supports a wide range of tasks of different natures. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
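The generative process the abstract describes, recursively partitioning a blank scene into colored cells, can be illustrated with a minimal sketch. This is not the authors' model: the stopping probability, the uniform placement of cuts, the even choice between vertical and horizontal cuts, the inclusion of white in the palette, and all names (`partition`, `generate_image`, `PALETTE`) are assumptions made only for illustration.

```python
import random

# Assumed palette: the three primaries plus white (white is an illustrative assumption).
PALETTE = ["red", "yellow", "blue", "white"]


def partition(x0, y0, x1, y1, depth, rng, stop_prob=0.3, min_size=0.1):
    """Recursively split the rectangle (x0, y0, x1, y1) into colored leaf cells.

    Returns a list of (x0, y0, x1, y1, color) tuples. All sampling choices
    here are hypothetical, not taken from the paper.
    """
    can_cut_vertical = (x1 - x0) >= 2 * min_size
    can_cut_horizontal = (y1 - y0) >= 2 * min_size
    # Stop when the depth budget is spent, the cell is too small, or by chance.
    if (depth == 0
            or not (can_cut_vertical or can_cut_horizontal)
            or rng.random() < stop_prob):
        return [(x0, y0, x1, y1, rng.choice(PALETTE))]
    if can_cut_vertical and (not can_cut_horizontal or rng.random() < 0.5):
        cut = rng.uniform(x0 + min_size, x1 - min_size)
        return (partition(x0, y0, cut, y1, depth - 1, rng, stop_prob, min_size)
                + partition(cut, y0, x1, y1, depth - 1, rng, stop_prob, min_size))
    cut = rng.uniform(y0 + min_size, y1 - min_size)
    return (partition(x0, y0, x1, cut, depth - 1, rng, stop_prob, min_size)
            + partition(x0, cut, x1, y1, depth - 1, rng, stop_prob, min_size))


def generate_image(seed, max_depth=4):
    """Sample one Mondrian-style layout on the unit square from a seeded generator."""
    rng = random.Random(seed)
    return partition(0.0, 0.0, 1.0, 1.0, max_depth, rng)


if __name__ == "__main__":
    for cell in generate_image(seed=0):
        print(cell)
```

Because every image produced this way carries its own partition history, the same final grid can in principle arise from several different sequences of cuts, which is the sense in which an observed image is open to many possible interpretations.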