Abstract

We introduce a novel algorithm for online estimation of Acoustic Impulse Responses (AIRs) that achieves fast convergence by exploiting prior knowledge about the fundamental structure of AIRs. The proposed method assumes that the variability of the AIRs of an acoustic scene is confined to a low-dimensional manifold embedded in the high-dimensional space of possible AIR estimates. We discuss various approaches that exploit a training data set of AIRs, e.g., high-accuracy AIR estimates from the acoustic scene, to learn a local affine subspace approximation of the AIR manifold. The model is motivated by the idea of describing the generally nonlinear AIR manifold locally by tangential hyperplanes, and its validity is verified for simulated data. Subsequently, we describe how the manifold assumption can be used to enhance online AIR estimates by projecting them onto an adaptively estimated subspace. Motivated by the assumption that manifolds are locally Euclidean, the parameters determining the adaptive subspace are learned from the AIR training samples nearest to the current AIR estimate. To assess the proximity of training AIRs to the current AIR estimate, we introduce a probabilistic extension of the Euclidean distance that improves performance for applications with non-white excitation signals. Furthermore, we describe how model imperfections can be tackled by a soft projection of the AIR estimates. The proposed algorithm converges significantly faster than a high-performance state-of-the-art algorithm. In addition, we show improved steady-state performance for speech-excited system identification scenarios suffering from high-level interfering noise and nonunique solutions.
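The projection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain Euclidean nearest-neighbor search, a PCA-style fit via the singular value decomposition, and a fixed blending weight `alpha` for the soft projection. The function names and the parameters `k`, `dim`, and `alpha` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_local_affine_subspace(train, estimate, k, dim):
    """Fit an affine subspace to the k training AIRs nearest to `estimate`.

    train:    (N, L) array, one training AIR per row
    estimate: (L,) current AIR estimate
    Returns the offset (L,) and an orthonormal basis (L, dim).
    """
    # Assumption: plain Euclidean distance selects the neighbors.
    dists = np.linalg.norm(train - estimate, axis=1)
    nearest = train[np.argsort(dists)[:k]]
    offset = nearest.mean(axis=0)
    # Principal directions of the centered neighbors span the local subspace.
    _, _, vt = np.linalg.svd(nearest - offset, full_matrices=False)
    return offset, vt[:dim].T


def soft_project(estimate, offset, basis, alpha):
    """Blend the estimate with its projection onto the affine subspace.

    alpha = 1 gives the hard projection, alpha = 0 leaves the estimate as is.
    """
    projected = offset + basis @ (basis.T @ (estimate - offset))
    return alpha * projected + (1.0 - alpha) * estimate


# Synthetic example: training AIRs lying on a 3-dimensional affine subspace.
rng = np.random.default_rng(0)
length, dim = 64, 3
true_basis, _ = np.linalg.qr(rng.standard_normal((length, dim)))
true_offset = rng.standard_normal(length)
train = true_offset + rng.standard_normal((200, dim)) @ true_basis.T
true_air = true_offset + rng.standard_normal(dim) @ true_basis.T
noisy_estimate = true_air + 0.1 * rng.standard_normal(length)

offset, basis = fit_local_affine_subspace(train, noisy_estimate, k=10, dim=dim)
enhanced = soft_project(noisy_estimate, offset, basis, alpha=0.9)
print("error before:", np.linalg.norm(noisy_estimate - true_air))
print("error after: ", np.linalg.norm(enhanced - true_air))
```

In this toy setting the training data lie exactly on an affine subspace, so the projection removes most of the estimation noise. On a genuinely nonlinear manifold the local fit is only approximate, which is the situation the soft projection is meant to address.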

Highlights

  • Both convergence speed and noise robustness are usually addressed by Adaptive Filter (AF) algorithms with adaptive step-size control [2]

  • In contrast to state-of-the-art approaches, the subspace parameters are inferred online from the K-Nearest Neighbor (KNN) Acoustic Impulse Response (AIR) training samples of the current AIR estimate, which allows for improved modeling of realistic acoustic scenes

  • We start by introducing the first-order Markov model assumption which is commonly used in Kalman Filter (KF)-based system identification algorithms and discuss its limitations


Summary

Introduction

The continuously increasing number of acoustic communication devices has fueled research on reliable speech enhancement algorithms. In contrast to state-of-the-art approaches, the subspace parameters are inferred online from the K-Nearest Neighbor (KNN) AIR training samples of the current AIR estimate, which allows for improved modeling of realistic acoustic scenes. To assess the proximity of training samples to the current estimate, we propose a novel probabilistically motivated distance measure that takes into account the convergence state of the adaptive filter. It is shown that the proposed method improves the convergence speed of KF-based system identification algorithms and achieves higher steady-state performance in scenarios suffering from high-level interfering noise.
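The exact form of the probabilistically motivated distance is not given in this summary. As an illustration of the general idea only, the following sketch assumes a Mahalanobis-type distance in which each coefficient is weighted by the inverse of a per-coefficient error variance, such as the diagonal of a Kalman filter's state error covariance. The names `weighted_sq_distance`, `k_nearest`, and `err_var` are hypothetical.

```python
import numpy as np


def weighted_sq_distance(train, estimate, err_var):
    """Squared distance with each coefficient weighted by 1 / err_var.

    train:    (N, L) array of training AIRs
    estimate: (L,) current AIR estimate
    err_var:  (L,) assumed per-coefficient estimation error variance
    """
    diff = train - estimate
    return np.sum(diff**2 / err_var, axis=1)


def k_nearest(train, estimate, err_var, k):
    """Indices of the k training AIRs closest under the weighted distance."""
    return np.argsort(weighted_sq_distance(train, estimate, err_var))[:k]


# Two training vectors: the first deviates in a poorly converged coefficient,
# the second in a well converged one.
train = np.array([[3.0, 0.0], [0.0, 1.0]])
estimate = np.zeros(2)
err_var = np.array([100.0, 0.01])

print("Euclidean choice:", k_nearest(train, estimate, np.ones(2), k=1))
print("weighted choice: ", k_nearest(train, estimate, err_var, k=1))
```

Under this assumed weighting, deviations in coefficients that the filter has not yet identified reliably count for less, so the neighbor search is driven by the well-converged part of the estimate. With equal variances the measure reduces to the ordinary squared Euclidean distance.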

Probabilistic Signal Model
Analysis of Acoustic Impulse Responses
Acoustic Impulse Response Manifold
Affine Subspace Model
Affine Subspace Parameter Estimation
Local Training Data Estimation
Acoustic Impulse Response Denoising
Kalman Filter‐based Acoustic Impulse Response Estimation
Adaptive Subspace Tracking
Soft Subspace Projection
Algorithmic Description
Experiments
Findings
Summary and Outlook