Auditory Time-Frequency Masking for Spectrally and Temporally Maximally-Compact Stimuli.

Thibaud Necciari, Peter Balazs, Sølvi Ystad, Sabine Meunier, Richard Kronland-Martinet, Bernhard Laback, Sophie Savel

doi:10.1371/journal.pone.0166937

Abstract

Many audio applications perform perception-based time-frequency (TF) analysis by decomposing sounds into a set of functions with good TF localization (i.e. with a small essential support in the TF domain) using TF transforms and applying psychoacoustic models of auditory masking to the transform coefficients. To accurately predict masking interactions between coefficients, the TF properties of the model should match those of the transform. This involves having masking data for stimuli with good TF localization. However, little is known about TF masking for mathematically well-localized signals. Most existing masking studies used stimuli that are broad in time and/or frequency and few studies involved TF conditions. Consequently, the present study had two goals. The first was to collect TF masking data for well-localized stimuli in humans. Masker and target were 10-ms Gaussian-shaped sinusoids with a bandwidth of approximately one critical band. The overall pattern of results is qualitatively similar to existing data for long maskers. To facilitate implementation in audio processing algorithms, a dataset provides the measured TF masking function. The second goal was to assess the potential effect of auditory efferents on TF masking using a modeling approach. The temporal window model of masking was used to predict present and existing data in two configurations: (1) with standard model parameters (i.e. without efferents), (2) with cochlear gain reduction to simulate the activation of efferents. The ability of the model to predict the present data was quite good with the standard configuration but highly degraded with gain reduction. Conversely, the ability of the model to predict existing data for long maskers was better with than without gain reduction. Overall, the model predictions suggest that TF masking can be affected by efferent (or other) effects that reduce cochlear gain. Such effects were avoided in the experiment of this study by using maximally-compact stimuli.
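
To make the stimulus description concrete, the following is a minimal sketch of a Gaussian-shaped sinusoid, that is, a cosine carrier under a Gaussian envelope. A Gaussian envelope is the shape that minimizes the product of duration and bandwidth, which is what makes such a signal maximally compact in the TF plane. The function name `gaussian_tone`, the 4-kHz carrier, the 48-kHz sample rate, and the envelope parameter `gamma = 600` are illustrative assumptions, not values taken from the abstract; only the 10-ms duration comes from the text above.

```python
import math

def gaussian_tone(freq_hz, duration_s, sample_rate, gamma):
    """Return samples of a cosine carrier under a Gaussian envelope.

    The envelope is exp(-pi * (gamma * t)**2), centred in the window.
    A larger gamma (in Hz) gives a shorter envelope and a wider spectrum.
    All parameter values used below are assumptions for illustration.
    """
    n = int(round(duration_s * sample_rate))
    centre = (n - 1) / 2.0  # place t = 0 in the middle of the window
    samples = []
    for i in range(n):
        t = (i - centre) / sample_rate
        envelope = math.exp(-math.pi * (gamma * t) ** 2)
        samples.append(envelope * math.cos(2.0 * math.pi * freq_hz * t))
    return samples

# Hypothetical settings: 4-kHz carrier, 10-ms window, 48-kHz sampling.
tone = gaussian_tone(freq_hz=4000.0, duration_s=0.010, sample_rate=48000, gamma=600.0)
```

With these assumed settings the envelope has decayed to a negligible level at the window edges, so truncating the signal to 10 ms adds very little spectral splatter.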

Highlights

It is of great interest in audio applications to take human auditory perception into account in the signal processing chain.
To obtain a perceptually motivated TF analysis, one can choose a set of atoms whose duration and bandwidth approximate the time and frequency resolution of the human auditory system and/or apply a psychoacoustic model of auditory masking to the coefficients of the transform.
Sparsity-based approaches combine TF decompositions and masking models to reduce the number of nonzero TF coefficients [8, 9].

Summary

Introduction

It is of great interest in audio applications to take human auditory perception into account in the signal processing chain. This generally consists of performing a perceptually motivated time-frequency (TF) analysis of the signal. To obtain a perceptually motivated TF analysis, one can choose a set of atoms whose duration and bandwidth approximate the time and frequency resolution of the human auditory system and/or apply a psychoacoustic model of auditory masking to the coefficients of the transform. To reduce the digital size of audio files, audio codecs like MP3 decompose sounds into TF segments (ideally using a transform that approximates the auditory frequency resolution, as in [5]) and apply a masking model to reduce the bit rates in these segments. Source separation algorithms estimate binary masks to weight the TF coefficients of sound mixtures based on auditory masking in order to separate the signal(s) of interest [12, 13].
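
As a rough illustration of how a binary mask weights TF coefficients, the sketch below keeps a coefficient of the mixture when the target magnitude exceeds the interferer magnitude by a threshold, and zeroes it otherwise. It assumes that both magnitudes are already known, which real source separation algorithms must estimate; the function names `binary_mask` and `apply_mask` and the `threshold_db` parameter are hypothetical and are not taken from [12, 13].

```python
def binary_mask(target_mag, interferer_mag, threshold_db=0.0):
    """Return a 0/1 mask with the same shape as the input magnitude grids.

    A cell is 1 when the target magnitude exceeds the interferer magnitude
    by more than threshold_db (a hypothetical parameter), else 0.
    """
    ratio = 10.0 ** (threshold_db / 20.0)  # dB threshold as an amplitude ratio
    return [
        [1 if t > i * ratio else 0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_mag, interferer_mag)
    ]

def apply_mask(mixture, mask):
    """Weight each TF coefficient of the mixture by the matching mask cell."""
    return [
        [c * m for c, m in zip(c_row, m_row)]
        for c_row, m_row in zip(mixture, mask)
    ]

# Toy 2x2 grids (rows: time frames, columns: frequency bins); values are made up.
mask = binary_mask([[1.0, 0.1], [0.5, 2.0]], [[0.5, 1.0], [0.5, 1.0]])
separated = apply_mask([[1 + 1j, 2.0], [3.0, 4.0]], mask)
```

The comparison is done on amplitude ratios rather than on logarithms so that cells with zero magnitude need no special handling.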

Journal: PLOS ONE	Publication Date: Nov 22, 2016
Citations: 5	License type: CC BY 4.0
