Abstract
The sound of a busy environment, such as a city street, gives rise to the perception of numerous distinct events in a human listener, a process termed the 'auditory scene analysis' of the acoustic information. Recent advances in the understanding of this process from experimental psychoacoustics have led to several efforts to build a computer model capable of the same function. This work is known as 'computational auditory scene analysis'. The dominant approach to this problem has been to treat it as a sequence of modules, the output of one forming the input to the next. Sound is converted to its spectrum, cues are picked out, and representations of the cues are grouped into an abstract description of the initial input. This 'data-driven' approach has some specific weaknesses in comparison to the auditory system: it will interpret a given sound in the same way regardless of its context, and it cannot 'infer' the presence of a sound for which direct evidence is hidden by other components.

The 'prediction-driven' approach is presented as an alternative, in which analysis is a process of reconciliation between the observed acoustic features and the predictions of an internal model of the sound-producing entities in the environment. In this way, predicted sound events will form part of the scene interpretation as long as they are consistent with the input sound, regardless of whether direct evidence is found. A blackboard-based implementation of this approach is described which analyzes dense, ambient sound examples into a vocabulary of noise clouds, transient clicks, and a correlogram-based representation of wide-band periodic energy called the weft.

The system is assessed through experiments that firstly investigate subjects' perception of distinct events in ambient sound examples, and secondly collect quality judgments for sound events resynthesized by the system. Although the resyntheses were rated as far from perfect, there was good agreement between the events detected by the model and those reported by the listeners.
In addition, the experimental procedure does not depend on special aspects of the algorithm (other than the generation of resyntheses), and is applicable to the assessment and comparison of other models of human auditory organization. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
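The reconciliation idea at the heart of the prediction-driven approach can be illustrated with a minimal sketch. This is not the thesis implementation (which is blackboard-based and operates on correlogram features); it is a toy loop, with all names and thresholds hypothetical, showing how predicted elements can be retained even when partly masked, and how unexplained energy can trigger a newly hypothesized element:

```python
import numpy as np

def reconcile(observed, predictions, tolerance=1.0):
    """Toy prediction-driven reconciliation: keep each predicted element
    that is consistent with the observation (i.e. the observation leaves
    room for its energy, even if other components mask it), then explain
    any large residual with a new hypothesized element."""
    retained = []
    residual = observed.copy()
    for name, pred in predictions:
        # A prediction survives as long as it does not demand more energy
        # than the observation can supply -- direct evidence is optional.
        if np.all(pred <= residual + tolerance):
            retained.append(name)
            residual = np.maximum(residual - pred, 0.0)
    if residual.sum() > tolerance:
        # Unexplained energy: hypothesize a new element
        # (in the thesis vocabulary: a noise cloud, click, or weft).
        retained.append("new_element")
    return retained

# Hypothetical per-band energies for one analysis frame.
obs = np.array([4.0, 3.0, 2.0])
preds = [("noise_cloud", np.array([2.0, 2.0, 1.0])),
         ("weft", np.array([1.0, 1.0, 1.0]))]
print(reconcile(obs, preds))
```

Both predicted elements fit within the observed energy, so both are retained; only if the remaining residual were large would a new element be proposed. A purely data-driven pipeline, by contrast, would have to find direct evidence for each element in the cues themselves.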