A previous modelling study reported that spectro-temporal cues perceptually relevant to humans provide enough information to accurately classify "natural soundscapes" recorded in four distinct temperate habitats of a biosphere reserve [Thoret, Varnet, Boubenec, Ferriere, Le Tourneau, Krause, and Lorenzi (2020). J. Acoust. Soc. Am. 147, 3260]. The goal of the present study was to assess this prediction for humans using 2 s samples taken from the same soundscape recordings. Thirty-one listeners were asked to discriminate these recordings based on differences in habitat, season, or period of the day using an oddity task. Listeners' performance was well above chance, demonstrating effective processing of these differences and suggesting a general high sensitivity for natural soundscape discrimination. This performance did not improve with training up to 10 h. Additional results obtained for habitat discrimination indicate that temporal cues play only a minor role; instead, listeners appear to base their decisions primarily on gross spectral cues related to biological sound sources and habitat acoustics. Convolutional neural networks were trained to perform a similar task using spectro-temporal cues extracted by an auditory model as input. The results are consistent with the idea that humans exclude the available temporal information when discriminating short samples of habitats, implying a form of a sub-optimality.
Read full abstract