Abstract

The concept of saliency describes how relevant a stimulus is for humans. This phenomenon has been studied from different perspectives and in different modalities: auditory, visual, or both. It has been used to help intelligent systems interact with their environment, in an attempt to emulate or even outperform human behavior in applications such as surveillance, alarm systems, and robotics. In this paper, we focus on the aural modality. Our goal is to measure the robustness of Echoic log-surprise against a set of auditory saliency techniques on the task of saliency detection in noisy environments. The acoustic saliency methods that we have analyzed are Kalinli's saliency model, Bayesian log-surprise, and our proposed algorithm, Echoic log-surprise. This last method combines an unsupervised approach based on Bayesian log-surprise with the biological concept of echoic (auditory sensory) memory by means of a statistical fusion scheme, for which several distance metrics and statistical divergences, such as Rényi's and Jensen-Shannon's, are considered. For comparison purposes, we have also evaluated some classical onset detection techniques, such as those based on voice activity detection or energy thresholding. Results show that Echoic log-surprise outperforms the other techniques analyzed in this paper across a wide variety of noise types and signal-to-noise ratios, corroborating its robustness in noisy environments. In particular, our algorithm with the Jensen-Shannon fusion scheme produces the best F-scores. To better understand the behavior of Echoic log-surprise, we have also studied how its control parameters, depth and memory, influence its performance at different noise levels.
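As a point of reference for the Jensen-Shannon divergence named above, the following is a minimal sketch of the standard textbook definition for two discrete probability distributions. It is not the paper's fusion scheme, which the abstract does not specify; the function names `kl_divergence` and `jensen_shannon` and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Defined as the average KL divergence of p and q from their
    midpoint m = (p + q) / 2. Wherever p_i or q_i is positive, m_i is
    positive too, so the ratios inside kl_divergence are well defined.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Two hypothetical distributions, used only to exercise the functions.
    p = [0.7, 0.2, 0.1]
    q = [0.1, 0.3, 0.6]
    print(jensen_shannon(p, q))
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric in its two arguments and, with base-2 logarithms, bounded between 0 and 1.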
