Abstract

This paper concerns the problem of convolutive sound source separation from multichannel recordings made with a spherical microphone array. In particular, we formulate two state-of-the-art separation techniques based on Expectation Maximization (EM) and Nonnegative Tensor Factorization (NTF) in the spherical harmonic domain (SHD). Furthermore, we adapt the Gaussian Localization Prior (GLP) and incorporate it into the proposed algorithms, which yields two variants of the derived methods. For the source signal reconstruction, a Minimum Variance Distortionless Response (MVDR) beamformer with a single-channel Wiener post-filter is employed. The performance comparison is based on an experimental evaluation using microphone signals simulated with the image-source method in several scenarios, including diverse geometrical setups, different numbers of sources, and various types of source signals, namely recordings of speech utterances and musical instruments. The experimental results for first-order ambisonic signals show that the proposed methods enable high-quality sound source separation in the spherical harmonic domain. In particular, we show that incorporating the Gaussian Localization Prior into the proposed algorithms leads to a substantial improvement in separation performance.
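As a rough illustration of the reconstruction stage mentioned above, the following sketch computes textbook MVDR weights, w = R^{-1} d / (d^H R^{-1} d), for a single frequency bin and applies a single-channel Wiener gain to the beamformer output. It is not the authors' implementation: the function names `mvdr_weights` and `wiener_gain`, the random covariance matrix, and the random steering vector are all assumptions made for the example. Only the channel count of four, which corresponds to first-order ambisonic signals, is taken from the abstract.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    r_inv_d = np.linalg.solve(noise_cov, steering)  # R^{-1} d without an explicit inverse
    return r_inv_d / (steering.conj() @ r_inv_d)

def wiener_gain(source_psd, noise_psd, floor=1e-12):
    """Single-channel Wiener gain phi_s / (phi_s + phi_n), guarded against division by zero."""
    return source_psd / np.maximum(source_psd + noise_psd, floor)

# Hypothetical test data: first-order ambisonics has (1 + 1)^2 = 4 SH channels.
rng = np.random.default_rng(0)
n_ch = 4
a = rng.standard_normal((n_ch, n_ch)) + 1j * rng.standard_normal((n_ch, n_ch))
noise_cov = a @ a.conj().T + n_ch * np.eye(n_ch)  # Hermitian, positive definite
steering = rng.standard_normal(n_ch) + 1j * rng.standard_normal(n_ch)

w = mvdr_weights(noise_cov, steering)
response = w.conj() @ steering  # distortionless constraint: w^H d should equal 1

# Residual noise power at the beamformer output, w^H R w = 1 / (d^H R^{-1} d).
residual_noise = np.real(w.conj() @ noise_cov @ w)
gain = wiener_gain(source_psd=1.0, noise_psd=residual_noise)  # assumed unit source power
```

The check on `response` reflects the defining property of MVDR: the target direction passes with unit gain, while the post-filter gain stays between 0 and 1 and attenuates bins dominated by residual noise.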
