Gestalt psychologists in the early part of the twentieth century challenged psychophysical notions that perceptual phenomena can be understood from a punctate (‘atomistic’) analysis of the elements present in the stimulus. Their ideas also inhibited later attempts to explain vision in terms of single-unit recordings from individual neurons. A rapprochement between Gestalt phenomenology and physiology seemed unlikely when the first ECVP was held in Marburg, Germany, in 1978. Since that time, response properties of neurons have been discovered that invite an interpretation of visual phenomena (including ‘illusions’) in terms of neuronal processing. Indeed, it is now possible to understand some Gestalt phenomena on the basis of known neurophysiological mechanisms. I begin by outlining the great strides that have been made since the advent of microelectrode recording from single neurons. Initially, cells (‘detectors’) selectively responding to the contrast, spatial frequency, wavelength, orientation, movement, and disparity of a stimulus placed in their receptive fields were used to interpret simple perceptual phenomena (eg, Mach bands, Hermann grids, the tilt aftereffect, and the motion aftereffect). In recent years, cells at higher levels of the visual system have been discovered that might explain a number of more complex phenomena: the perception of illusory (occluded) contours by end-stopped cells in area V2, the filling-in of artificial scotomata by neurons in V3, colour constancy by ‘perceptive’ neurons in V4, and the perception of coherent motion in dynamic noise patterns by cells in MT. Studies of flow fields and biological motion in area MST have recently been added to account for our perceptions as we move through our environment. Prompted by these findings, a shift from local to global interactions ‘beyond the classical receptive field’ has taken place in our search for the neural substrates of perception.
Current research has focused on three kinds of mechanisms: (i) converging feed-forward projections as the basis for new response properties emerging at higher levels, (ii) recruitment of lateral connections to explain filling-in, and (iii) backward propagation from higher to lower levels to account for binding and figure-ground segregation. How such mechanisms compute large-scale surface properties such as brightness, colour, and depth from local features (indeed, how they construct the surfaces themselves from complex natural scenes) is only one of the many questions under scrutiny today. Future research will have to tackle the all-important question: How does the analysed information come together again? Furthermore, the contributions of eye movements, attention, learning, other sense modalities, and motor actions will have to be taken into consideration before we arrive at a more complete understanding of visual perception.