Abstract

This paper addresses the separation of audio sources from convolutive mixtures captured by a microphone array. We approach the problem using complex-valued non-negative matrix factorization (CNMF), and extend previous works by tailoring advanced (single-channel) NMF models, such as the deconvolutive NMF, to the multichannel factorization setup. Further, a sparsity-promoting scheme is proposed so that the estimated parameters better fit the time-frequency properties inherent in some audio sources. The proposed parameter estimation framework is compatible with previous related works, and can be thought of as a step toward a more general method. We evaluate the resulting separation accuracy using a simulated acoustic scenario, and the tests confirm that the proposed algorithm provides superior separation quality when compared to a state-of-the-art benchmark. Finally, an analysis of the effects of the introduced regularization term shows that the solution is in fact steered toward a sparser representation.
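To make the idea of a sparsity-promoting factorization concrete, the following is a minimal sketch of a much simpler relative of the method summarized above: real-valued, single-channel NMF with an L1 penalty on the activations, fitted with multiplicative updates. It is not the multichannel complex-valued algorithm of the paper. The abstract does not specify the form of the regularizer, so the L1 penalty, the Euclidean cost, the column normalization heuristic, and all names (`sparse_nmf`, `lam`, `rank`) are assumptions chosen for illustration.

```python
import numpy as np


def sparse_nmf(V, rank, lam=0.1, n_iter=200, eps=1e-12, seed=0):
    """Approximate V (F x T, non-negative) as W @ H with an L1 penalty on H.

    Illustrative objective (an assumption, not taken from the paper):
        0.5 * ||V - W H||_F^2 + lam * sum(H)
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + eps  # spectral templates
    H = rng.random((rank, T)) + eps  # time activations

    for _ in range(n_iter):
        # Multiplicative update for H; lam in the denominator shrinks H.
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        # Standard multiplicative update for W (no penalty on W).
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Heuristic: fix the scale of W so the penalty on H cannot be
        # sidestepped by moving energy from H into W. W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


# Toy usage on synthetic, exactly low-rank, non-negative data.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = sparse_nmf(V, rank=3, lam=0.1)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
print("L1 norm of activations:", H.sum())
```

Increasing `lam` trades reconstruction accuracy for smaller activations, which is the general mechanism by which a regularization term can steer a factorization toward a sparser representation.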
