Detecting and selecting sound events is emerging as a technique for characterizing and representing the sound environment of a specific location. In this article we propose a computational model for automatically constructing a so-called acoustic summary, i.e. a comprehensive collection of sounds that represents the specific sound environment at a given location. Such an acoustic summary could be used by architects, soundscape designers, and urban planners to explore, by listening, the sonic environment at a location as it is perceived by a human listener. The model is based on a self-organizing map, a type of neural network, and starts by extracting several psychoacoustic features from the sound. Extensive unsupervised training then tunes the map to the typical sounds that are likely to be heard at the microphone location. The learning algorithm takes into account some basic aspects of human perception: for example, salient events tend to be remembered better than events that do not stand out, even if they occur less frequently. After training, the self-organizing map is used to form an exhaustive acoustic summary by automatically recording specific sound events at the microphone location. In addition to describing the proposed tool, the article presents a validation test with local residents, designed to show the model's ability to pick up the sounds that bring out the distinctiveness and specificity of the soundscape as a local resident would.
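The training idea described above can be illustrated with a minimal sketch of a self-organizing map whose update strength depends on how salient each sound sample is. This is not the authors' algorithm: the abstract does not specify the feature set, the map size, or how saliency enters the learning rule. The function names `train_som` and `best_matching_unit`, the saliency score in the range 0 to 1, and the rule that scales the learning gain by `0.5 + 0.5 * saliency` are all assumptions made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: sq_dist(weights[u], x))


def train_som(features, saliency, rows=3, cols=3, epochs=30,
              lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM on feature vectors (hypothetical saliency weighting)."""
    rng = random.Random(seed)
    # Grid position of every unit, used by the neighbourhood function.
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each unit from a randomly chosen training vector.
    weights = [list(rng.choice(features)) for _ in coords]
    total = epochs * len(features)
    step = 0
    for _ in range(epochs):
        order = list(range(len(features)))
        rng.shuffle(order)
        for i in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)              # decaying learning rate
            sigma = 0.3 + sigma0 * (1.0 - frac)  # shrinking neighbourhood
            # Assumed rule: salient samples get up to twice the gain of
            # samples that do not stand out, so rare but salient events
            # still shape the map.
            gain = lr * (0.5 + 0.5 * saliency[i])
            x = features[i]
            bmu = best_matching_unit(weights, x)
            for u, (r, c) in enumerate(coords):
                d2 = (r - coords[bmu][0]) ** 2 + (c - coords[bmu][1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                w = weights[u]
                for k in range(len(w)):
                    w[k] += gain * h * (x[k] - w[k])
            step += 1
    return weights
```

In such a sketch, each row of `features` would hold the psychoacoustic features of one sound sample, and after training `best_matching_unit` could be used to decide which map unit an incoming sound belongs to, for instance to trigger a recording when a unit has no representative sound yet. That selection step is likewise an assumption and not a description of the proposed tool.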