Abstract
We introduce a novel algorithm for online estimation of Acoustic Impulse Responses (AIRs) that achieves fast convergence by exploiting prior knowledge about the fundamental structure of AIRs. The proposed method assumes that the variability of the AIRs of an acoustic scene is confined to a low-dimensional manifold embedded in a high-dimensional space of possible AIR estimates. We discuss various approaches that exploit a training data set of AIRs, e.g., high-accuracy AIR estimates from the acoustic scene, to learn a local affine subspace approximation of the AIR manifold. The model is motivated by the idea of describing the generally nonlinear AIR manifold locally by tangential hyperplanes, and its validity is verified for simulated data. Subsequently, we describe how the manifold assumption can be used to enhance online AIR estimates by projecting them onto an adaptively estimated subspace. Motivated by the assumption that manifolds are locally Euclidean, the parameters determining the adaptive subspace are learned from the AIR training samples nearest to the current AIR estimate. To assess the proximity of training AIRs to the current AIR estimate, we introduce a probabilistic extension of the Euclidean distance that improves performance for applications with non-white excitation signals. Furthermore, we describe how model imperfections can be tackled by a soft projection of the AIR estimates. The proposed algorithm converges significantly faster than a high-performance state-of-the-art algorithm. In addition, we show improved steady-state performance for speech-excited system identification scenarios suffering from high-level interfering noise and nonunique solutions.
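The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the local affine subspace is fitted by a mean plus principal directions (via SVD) of the nearest training AIRs, it uses the plain Euclidean distance rather than the probabilistic extension, and it models the soft projection as a convex combination with a hypothetical weight `alpha`. The function name and the parameters `k`, `dim`, and `alpha` are chosen for illustration only.

```python
import numpy as np


def local_affine_projection(h_est, H_train, k=5, dim=2, alpha=1.0):
    """Project an AIR estimate onto an affine subspace fitted to its k nearest training AIRs.

    h_est   : (L,) current AIR estimate
    H_train : (N, L) training AIRs, one per row
    k       : number of nearest neighbors (should exceed dim)
    dim     : dimension of the local affine subspace
    alpha   : assumed soft-projection weight in [0, 1]; 1 is a hard projection
    """
    # Plain Euclidean proximity (assumption: stands in for the probabilistic distance).
    dists = np.linalg.norm(H_train - h_est, axis=1)
    neighbors = H_train[np.argsort(dists)[:k]]

    # Affine subspace: offset is the neighbor mean, basis spans the dominant directions.
    mean = neighbors.mean(axis=0)
    _, _, vt = np.linalg.svd(neighbors - mean, full_matrices=False)
    basis = vt[:dim]  # (dim, L), orthonormal rows

    # Orthogonal projection onto the affine subspace.
    h_proj = mean + basis.T @ (basis @ (h_est - mean))

    # Soft projection: blend the projected and the unprojected estimate.
    return alpha * h_proj + (1.0 - alpha) * h_est


# Toy data: training "AIRs" that lie exactly on a 2-D affine subspace of a 16-D space.
rng = np.random.default_rng(0)
offset = rng.standard_normal(16)
directions = rng.standard_normal((2, 16))
H_train = offset + rng.standard_normal((50, 2)) @ directions

h_true = offset + np.array([0.3, -0.2]) @ directions
h_noisy = h_true + 0.1 * rng.standard_normal(16)
h_enhanced = local_affine_projection(h_noisy, H_train, k=5, dim=2, alpha=0.8)

print("error before:", np.linalg.norm(h_noisy - h_true))
print("error after: ", np.linalg.norm(h_enhanced - h_true))
```

In this toy setting the projection removes the noise components orthogonal to the subspace. Choosing `alpha` below 1 keeps part of the unprojected estimate, which is one simple way to hedge against a subspace model that does not fit perfectly.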
Highlights
Both convergence speed and noise robustness are usually addressed by Adaptive Filter (AF) algorithms with adaptive step-size control [2].
In contrast to state-of-the-art approaches, the subspace parameters are inferred online from the K-Nearest Neighbor (KNN) training samples of the current Acoustic Impulse Response (AIR) estimate, which allows for improved modeling of realistic acoustic scenes.
We start by introducing the first-order Markov model assumption, which is commonly used in Kalman Filter (KF)-based system identification algorithms, and discuss its limitations.
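For readers unfamiliar with the first-order Markov model mentioned above, the following sketch shows a generic time-domain Kalman filter for system identification. It is a textbook-style illustration under stated assumptions, not the algorithm of the paper: the state model is taken as h[n+1] = a * h[n] + w[n] with isotropic process noise, and the names `kf_system_id`, `a`, `q`, and `r` are hypothetical.

```python
import numpy as np


def kf_system_id(x, y, L, a=0.999, q=1e-6, r=1e-2):
    """Estimate a length-L impulse response from input x and noisy output y.

    Assumed first-order Markov state model: h[n+1] = a * h[n] + w[n], w ~ N(0, q*I)
    Observation model:                      y[n]   = x_n^T h[n] + v[n], v ~ N(0, r)
    """
    h = np.zeros(L)      # state estimate (impulse response)
    P = np.eye(L)        # state error covariance
    x_buf = np.zeros(L)  # most recent L input samples, newest first

    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]

        # Prediction step driven by the Markov model.
        h = a * h
        P = a**2 * P + q * np.eye(L)

        # Update step with the scalar observation y[n].
        err = y[n] - x_buf @ h
        Px = P @ x_buf
        gain = Px / (x_buf @ Px + r)
        h = h + gain * err
        P = P - np.outer(gain, Px)

    return h


# Toy example: identify a static 8-tap system from white-noise excitation.
rng = np.random.default_rng(1)
h_true = rng.standard_normal(8)
x = rng.standard_normal(2000)
y = np.convolve(x, h_true)[: len(x)] + 0.01 * rng.standard_normal(len(x))

h_hat = kf_system_id(x, y, L=8, a=1.0, q=1e-8, r=1e-4)
print("relative error:", np.linalg.norm(h_hat - h_true) / np.linalg.norm(h_true))
```

Note that the process noise in this sketch is isotropic, so the state model carries no information about which directions of change are plausible for the impulse response.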
Summary
The continuously increasing number of acoustic communication devices has fueled research on reliable speech enhancement algorithms. In contrast to state-of-the-art approaches, the subspace parameters are inferred online from the K-Nearest Neighbor (KNN) AIR training samples of the current AIR estimate, which allows for improved modeling of realistic acoustic scenes. To assess the proximity of training samples to the current estimate, we propose a novel probabilistically motivated distance measure that takes into account the convergence state of the adaptive filter. It is shown that the proposed method improves the convergence speed of Kalman Filter (KF)-based system identification algorithms and achieves higher steady-state performance in scenarios suffering from high-level interfering noise.