Estimating the locations of multiple simultaneously active sources in an acoustic sensor network is challenging because the correct combination of measurements received from the different microphone arrays is usually unknown; this is generally termed the data association problem. Existing techniques generally formulate data association as a multidimensional assignment problem and solve it using classical optimal techniques. Although these approaches are effective for a limited number of active sources, their computational time and complexity increase significantly as the number of microphones or acoustic sources grows. In this paper, a learning approach is proposed that solves the multidimensional assignment problem using a deep neural network. First, the features employed for the association are formulated for each detected source at every array node, and the multidimensional assignment problem is set up. Subsequently, a deep multidimensional assignment network is devised to extract the correspondence probabilities of the measurements received from the microphone arrays. Specifically, a data-driven, differentiable approach to multidimensional assignment is presented that is computationally efficient. The proposed methodology is validated under realistic conditions for both speech and urban signals. It is compared with state-of-the-art methods to demonstrate the performance gain in association accuracy and location estimation, together with a reduced computational time.