Abstract

Nonnegative matrix factorization (NMF) has shown good performance in blind audio source separation (BASS). While NMF analysis is a non-convex optimization problem when both the basis and encoding matrices must be estimated simultaneously, the source separation step of NMF-based BASS with a fixed basis matrix has been considered convex. However, because the basis matrix for BASS is typically constructed by concatenating the basis matrices trained on individual source signals, the subspace spanned by the basis vectors for one source may overlap with that for other sources. In this paper, we show that the resulting encoding vector is not unique when the subspaces spanned by the basis vectors for the sources overlap, which implies that the initialization of the encoding vector in the source separation stage is not trivial. Furthermore, we propose a novel method to initialize the encoding vector for the separation step based on a prior model of the encoding vector. Experimental results showed that the proposed method outperformed uniform random initialization by 1.09 and 2.21 dB in source-to-distortion ratio, and by 0.20 and 0.23 in PESQ score, for the supervised and semi-supervised cases, respectively.
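
The non-uniqueness claim can be illustrated with a toy example. The sketch below is not the paper's experiment or its proposed initialization: it assumes a Euclidean cost with standard multiplicative updates, and the names `encode`, `W1`, `W2`, `h_a`, and `h_b` are hypothetical. The single basis vector for source 2 lies in the span of the two basis vectors for source 1, so two different initializations both reconstruct the mixture exactly while attributing it to different sources.

```python
import numpy as np

def encode(W, x, h0, n_iter=500, eps=1e-12):
    """Multiplicative updates for min ||x - W h||^2 subject to h >= 0, W fixed."""
    h = h0.astype(float).copy()
    WtX = W.T @ x
    WtW = W.T @ W
    for _ in range(n_iter):
        # Elementwise update keeps h nonnegative when h0, W, x are nonnegative.
        h *= WtX / (WtW @ h + eps)
    return h

# Hypothetical bases: source 1 has two basis vectors, source 2 has one.
W1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [0.0, 0.0]])
W2 = np.array([[1.0],
               [1.0],
               [0.0]])           # equals W1[:, 0] + W1[:, 1], so the subspaces overlap
W = np.hstack([W1, W2])          # concatenated basis used for separation

x = np.array([2.0, 2.0, 0.0])    # observed mixture spectrum (toy values)

# Two initializations that differ only in which source they favor.
h_a = encode(W, x, np.array([1.0, 1.0, 0.01]))
h_b = encode(W, x, np.array([0.01, 0.01, 1.0]))

# Both encodings reconstruct x, but the per-source estimates differ.
print("h_a:", h_a, "source-1 estimate:", W1 @ h_a[:2])
print("h_b:", h_b, "source-1 estimate:", W1 @ h_b[:2])
print("reconstruction errors:",
      np.linalg.norm(x - W @ h_a), np.linalg.norm(x - W @ h_b))
```

In this toy setting both runs reach essentially zero reconstruction error, yet the first assigns nearly all of the mixture to source 1 and the second to source 2, which is why the starting point of the encoding vector matters even though the basis is fixed.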
