In this paper, the authors address the tasks of audio source counting and separation for two-channel instantaneous mixtures. This goal is achieved in two steps. First, a novel scheme is proposed for estimating the number of sources and the corresponding channel intensity difference (CID) values. For this purpose, an angular spectrum is evaluated as a function of the ratio of the magnitude spectrograms of the two channels, and the peak locations of that spectrum are obtained. In the second stage, a new approach is developed for extracting the individual source signals using Bayesian non-parametric modelling. The mean field variational Bayesian approach is applied to infer the unknown parameters. Classification is then performed on the inferred active CID values to obtain the individual source magnitude spectrograms. In this way, the number of spectral components used for modelling each source is found automatically from the data. The Bayesian approach is compared with the standard Kullback-Leibler non-negative tensor factorisation method to illustrate the effectiveness of Bayesian modelling. The performance of the source separation is measured using existing metrics for multichannel blind source separation evaluation. The experiments are performed on instantaneous mixtures from the dev2 database.
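The first stage can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each time-frequency bin is dominated by a single source, maps the ratio of the two channel magnitudes to an angle with `atan2`, accumulates an energy-weighted histogram as a stand-in for the angular spectrum, and counts its local maxima. The function names, the bin count, the relative threshold and the synthetic mixing angles are all hypothetical choices made for illustration.

```python
import math
import random


def angular_spectrum(mag_left, mag_right, n_bins=18):
    """Energy-weighted histogram of theta = atan2(|X2|, |X1|) on [0, pi/2]."""
    spectrum = [0.0] * n_bins
    for a1, a2 in zip(mag_left, mag_right):
        energy = a1 * a1 + a2 * a2
        if energy == 0.0:
            continue  # a silent bin carries no direction information
        theta = math.atan2(a2, a1)  # depends only on the ratio a2 / a1
        idx = min(int(theta / (math.pi / 2) * n_bins), n_bins - 1)
        spectrum[idx] += energy
    return spectrum


def find_peaks(spectrum, rel_threshold=0.1):
    """Indices of local maxima above rel_threshold times the global maximum."""
    floor = rel_threshold * max(spectrum)
    peaks = []
    for i, value in enumerate(spectrum):
        left = spectrum[i - 1] if i > 0 else 0.0
        right = spectrum[i + 1] if i < len(spectrum) - 1 else 0.0
        # Strict on the left, non-strict on the right: a plateau counts once.
        if value > floor and value > left and value >= right:
            peaks.append(i)
    return peaks


def synthetic_mixture(angles_deg, n_points=3000, noise=0.005, seed=0):
    """Hypothetical test data: one active source per time-frequency bin."""
    rng = random.Random(seed)
    left, right = [], []
    for _ in range(n_points):
        angle = math.radians(rng.choice(angles_deg))
        s = rng.uniform(0.5, 1.5)  # source magnitude in this bin
        left.append(abs(s * math.cos(angle) + rng.gauss(0.0, noise)))
        right.append(abs(s * math.sin(angle) + rng.gauss(0.0, noise)))
    return left, right


n_bins = 18
left, right = synthetic_mixture([12.5, 42.5, 77.5])
peaks = find_peaks(angular_spectrum(left, right, n_bins))
# Convert each peak bin to the angle at its centre, in degrees.
estimated_angles = [(i + 0.5) * 90.0 / n_bins for i in peaks]
print(len(peaks), estimated_angles)
```

On this synthetic input the sketch is expected to report three peaks, at the bin centres 12.5, 42.5 and 77.5 degrees. The number of peaks plays the role of the estimated source count and the peak angles play the role of the CID values. Real mixtures would likely need a finer angular grid and some smoothing of the spectrum before peak picking.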