A Low-Complexity Spectro-Temporal Distortion Measure for Audio Processing Applications

Cees H Taal, Richard Heusdens, Richard C Hendriks

doi:10.1109/tasl.2012.2184753

Abstract

Perceptual models exploiting auditory masking are frequently used in audio and speech processing applications such as coding and watermarking. In most cases, these models take into account only spectral masking within short-time frames. As a consequence, undesired audible artifacts may be introduced in the temporal domain (e.g., pre-echoes). In this article we present a new low-complexity spectro-temporal distortion measure. The model facilitates the computation of analytic expressions for masking thresholds, whereas advanced spectro-temporal models typically need computationally demanding adaptive procedures to estimate these thresholds. We show that the proposed method gives masking predictions similar to those of an advanced spectro-temporal model at only a fraction of its computational cost. The proposed method is also compared with a spectral-only model by means of a listening test. From this test it can be concluded that, for non-stationary frames, the spectral model underestimates the audibility of introduced errors and therefore overestimates the masking curve. As a consequence, the system of interest incorrectly assumes that errors are masked in a particular frame, which leads to audible artifacts. This is not the case with the proposed method, which correctly detects the errors made in the temporal structure of the signal.
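The claim that a frame-wise spectral measure can miss temporal artifacts such as pre-echoes can be illustrated with a small toy experiment. The sketch below is not the authors' distortion measure: it uses a plain DFT power ratio as a stand-in for a spectral-only model and a per-block energy ratio as a stand-in for a time-localised one, with no auditory filterbank. All names (`spectral_only_distortion`, `time_localised_distortion`, `FLOOR`, the frame and block lengths) are hypothetical choices made for illustration. The same error burst is placed either on top of a tone or, circularly shifted, in the silence before the tone onset.

```python
import cmath
import math

N = 128          # frame length in samples (illustrative choice)
BLOCK = 32       # sub-block length for the time-localised measure
FLOOR = 1e-3     # small constant standing in for an absolute threshold


def dft_power(x):
    """Power spectrum |X[k]|^2 for k = 0..N/2, computed with a direct DFT."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_only_distortion(signal, error):
    """Sum over frequency bins of error power relative to signal power."""
    return sum(e / (s + FLOOR) for s, e in zip(dft_power(signal), dft_power(error)))


def time_localised_distortion(signal, error):
    """Sum over short time blocks of error energy relative to signal energy."""
    total = 0.0
    for start in range(0, len(signal), BLOCK):
        sig_energy = sum(v * v for v in signal[start:start + BLOCK])
        err_energy = sum(v * v for v in error[start:start + BLOCK])
        total += err_energy / (sig_energy + FLOOR)
    return total


# Non-stationary frame: silence, then a tone onset halfway through.
signal = [0.0 if t < N // 2 else math.sin(2 * math.pi * 8 * t / N) for t in range(N)]

# Error burst that coincides with the tone (samples 64..95).
err_on_tone = [
    0.1 * math.sin(2 * math.pi * 20 * t / N) if 64 <= t < 96 else 0.0
    for t in range(N)
]

# Same burst circularly shifted by half a frame, so it lands in the silence
# before the onset (samples 0..31), like a pre-echo.
err_pre_echo = err_on_tone[N // 2:] + err_on_tone[:N // 2]

spec_on_tone = spectral_only_distortion(signal, err_on_tone)
spec_pre_echo = spectral_only_distortion(signal, err_pre_echo)
time_on_tone = time_localised_distortion(signal, err_on_tone)
time_pre_echo = time_localised_distortion(signal, err_pre_echo)

print("spectral-only :", spec_on_tone, spec_pre_echo)
print("time-localised:", time_on_tone, time_pre_echo)
```

Because a circular shift changes only the phase of the DFT, the spectral-only score is the same for both error placements, while the block-wise score is far larger when the error falls in the silent part of the frame. This mirrors, in a very simplified way, the abstract's point that a spectral-only model can treat an error as masked when it is not.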

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2012
Citations: 48

Similar Papers

Conditional Random Fields in Speech, Audio, and Language Processing
Eric Fosler-Lussier ... Preethi Jyothi
Proceedings of the IEEE | VOL. 101
01 Apr 2013

MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes
Anton Ratnarajah ... Dinesh Manocha
10 Oct 2022

A C++ research and development environment for speech and audio processing applications
A Erdem Ertan ... T.P Barnwell
29 Oct 2000

Voice Activity Detection
Tom Bäckström
01 Jan 2017
