Abstract

This paper proposes a novel solution for separating an unknown and time-varying number of moving acoustic sources in a blind setting using multiple microphone arrays. A standard steered-response power phase transform method is applied to extract source position measurements, which inevitably contain noise, false detections, and missed detections, and are not labeled with the source identities. The imperfect measurements lead to the space-time permutation problem, as there is no information on how the measurements are associated with the sources in space, nor how the measurements are connected across time, if at all. To solve this problem, a labeled random finite set tracking framework is adopted to jointly estimate the source positions and their labels or identities. Based on these trajectory estimates, a corresponding set of time-varying generalized side-lobe cancellers is constructed to perform source separation. The overall algorithm operates in a block-wise or an online fashion and is scalable with the number of microphone arrays. The quality of the measurements, tracking, and separation is evaluated with the OSPA metric, the OSPA(2) metric, and ITU-T P.835 based listening tests, respectively, on both real-world and simulated data.
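
For readers unfamiliar with the OSPA metric named above, the following is a minimal sketch of the standard OSPA distance between a set of estimated positions and a set of reference positions. The function name `ospa` and the default cutoff `c` and order `p` are illustrative assumptions, not the settings used in the paper, and the brute-force search over assignments is only practical for a handful of sources. The OSPA(2) metric, which compares tracks over time, is not covered here.

```python
import itertools
import math


def ospa(X, Y, c=1.0, p=1.0):
    """OSPA distance between two finite point sets.

    X, Y: lists of coordinate tuples (e.g. source positions in metres).
    c: cutoff, the penalty per unmatched point (assumed value).
    p: order of the metric (assumed value).
    """
    m, n = len(X), len(Y)
    if m == 0 and n == 0:
        return 0.0
    if m > n:
        # Ensure X is the smaller set.
        X, Y, m, n = Y, X, n, m
    if m == 0:
        # One set is empty: every point in the other is a cardinality error.
        return c
    best = math.inf
    # Brute-force search for the best assignment of X points to Y points.
    for perm in itertools.permutations(range(n), m):
        cost = sum(min(math.dist(X[i], Y[j]), c) ** p
                   for i, j in enumerate(perm))
        best = min(best, cost)
    # Localisation cost plus cutoff penalty for the n - m unmatched points.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)


# One exact match plus one false detection, with c = 1 and p = 1.
print(ospa([(0.0, 0.0)], [(0.0, 0.0), (5.0, 5.0)]))  # 0.5
```

The single score combines localisation error and cardinality error, which is why it suits measurement sets that contain false and missed detections.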

Highlights

  • In microphone array processing, blind source separation (BSS) is the estimation of source signals, using only the received mixture signals with no information about the original sources and the mixing process [1]

  • Source position measurements obtained through Steered-Response Power Phase Transform (SRP-PHAT) [25] exhibit the space-time permutation issue, where it is not known which measurement is connected to which source at the current time, nor which measurements are connected to the same source across time

  • This paper proposes a block-wise or online solution for blind source separation with multiple microphone arrays, which can accommodate an unknown time-varying number of acoustic moving sources in mild reverberation
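
SRP-PHAT, mentioned in the highlights above, accumulates phase-transform-weighted cross-correlations over microphone pairs. As a rough illustration of the phase transform alone, the sketch below estimates the delay between two channels with GCC-PHAT. It is not the paper's measurement front end: the helper names `dft` and `gcc_phat_delay` are hypothetical, the delay is circular and integer-valued, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def gcc_phat_delay(x1, x2):
    """Estimate the circular delay (in samples) of x2 relative to x1."""
    X1, X2 = dft(x1), dft(x2)
    # Cross-power spectrum between the two channels.
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]
    # Phase transform: discard magnitude, keep only phase.
    phat = [v / max(abs(v), 1e-12) for v in cross]
    # Back to the lag domain; the peak marks the delay.
    cc = [v.real for v in dft(phat, inverse=True)]
    n = len(cc)
    peak = max(range(n), key=lambda i: cc[i])
    # Map lags above n/2 to negative delays.
    return peak if peak <= n // 2 else peak - n


# Synthetic check: a noise signal and a copy circularly delayed by 3 samples.
random.seed(0)
x1 = [random.gauss(0.0, 1.0) for _ in range(32)]
x2 = [x1[(t - 3) % 32] for t in range(32)]
print(gcc_phat_delay(x1, x2))
```

In a full SRP-PHAT implementation, such pairwise correlations would be evaluated at the delays implied by each candidate source position and summed across pairs; peaks of that sum give unlabeled position measurements of the kind described above.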


Summary

INTRODUCTION

In microphone array processing, blind source separation (BSS) is the estimation of source signals, using only the received mixture signals with no information about the original sources and the mixing process [1]. Subsequent RFS-based solutions have been proposed for multi-source acoustic tracking with the Probability Hypothesis Density (PHD) filter [8], [19], [20], [21], the Cardinalized PHD filter [22], the Cardinality-Balanced Multi-Target Multi-Bernoulli filter [23], and the RFS Particle Filter [24]. These methods do not directly estimate source tracks, which are source position estimates associated with a common label. This work is the first to formally address the space-time permutation problem, using a labeled random finite set (RFS) approach [26], [27], [28] to jointly estimate the number of sources, their positions, and their labels. We evaluate the separation performance via subjective listening tests according to the ITU-T P.835 methodology [33].

PROBLEM FORMULATION AND SOLUTION OVERVIEW
Overview of the Proposed Method
SIGNAL PRE-PROCESSING
Multi-Source Bayesian Tracking Filter
The Multi-Source Transition Model
The Multi-Array Measurement Likelihood Model
SOURCE SEPARATION
Spatial Filtering
Post-processing
EXPERIMENTS
Experimental Setup
Parameters Breakdown
Evaluation of SRP-PHAT Multi-Array Measurements
Evaluation of Multi-Source Tracking Filter
Evaluation of Source Separation
CONCLUSION
