Abstract

The Generalized Cross Correlation (GCC), together with its use in Steered Response Power (SRP) techniques, is one of the most popular approaches to Acoustic Source Localization. Deep Learning strategies may now outperform these classical methods, but they generally depend on the room and sensor geometric configuration used during training. They therefore require adaptation and re-training when facing a new environment, which is a problem in practice because re-training requires labelling new data and running a complex training algorithm. In this work we use a Convolutional Deep Neural Network that transforms the GCC between two signals into a Gaussian-shaped signal, which we call Deep Generalized Cross Correlation (DeepGCC). We combine DeepGCC estimations to create a 3D acoustic map, similarly to SRP techniques. This acoustic map can be further refined using a sparse generative model to recover the source position. Crucially, we can adapt the acoustic map to different microphone array geometries without re-training the DeepGCC network. We show that our method outperforms both classical approaches and recent Deep Learning strategies in challenging real and simulated scenarios with mismatched training-testing conditions, without requiring re-training for different sensor configurations or room environments.
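
For readers unfamiliar with the GCC, the following is a minimal sketch of how a cross correlation between two microphone signals can be computed and used to estimate their relative delay. It assumes the PHAT weighting, a common GCC variant that the abstract does not specifically name, and it is not the DeepGCC network itself. The function name `gcc_phat` and the parameters `fs` and `max_lag` are hypothetical and chosen only for illustration.

```python
import numpy as np

def gcc_phat(x, y, fs, max_lag=None):
    """Estimate the delay of y relative to x, in seconds.

    Illustrative sketch only: PHAT weighting is an assumption,
    not necessarily the weighting used by the authors.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum, normalised to unit magnitude (PHAT).
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Keep lags in [-max_lag, +max_lag] samples, centred on zero lag.
    if max_lag is None:
        max_lag = n // 2
    max_lag = max(1, min(int(max_lag), n // 2))
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

    # A positive lag means y arrives later than x.
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs, cc
```

In an SRP-style approach, correlation curves like `cc` from several microphone pairs are accumulated over candidate source positions to build an acoustic map. According to the abstract, DeepGCC replaces the raw correlation with a Gaussian-shaped signal before that combination step.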
