Abstract

This paper proposes a method for monaural sound source separation that clusters the basis vectors obtained by Non-negative Matrix Factorization (NMF) according to their similarity. The basis vectors are clustered on the assumption that the basis vectors constituting the target sound source are more similar to one another than to the basis vectors of the other sound sources. Hierarchical clustering, which forms clusters in descending order of feature similarity, is introduced. Because hierarchical clustering does not require the number of clusters to be determined explicitly in advance, the basis vectors can be grouped into an arbitrary number of clusters by choosing a threshold. The proposed method can therefore separate a mixture into an arbitrary number of sound sources. Numerical evaluation showed that the Signal to Distortion Ratio (SDR), an evaluation index of sound source separation, can be improved by approximately 6 to 10 dB. Undesirable cases in which most of the basis vectors are classified into the same cluster are also discussed. In addition, separation of a mixture of three sound sources was evaluated, and it was confirmed that SDR can be improved by about 10 dB.
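
The pipeline described above (factorize a magnitude spectrogram, cluster the basis vectors hierarchically, cut the tree at a threshold, rebuild one signal per cluster) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the NMF cost function, the similarity measure, the linkage rule, or the reconstruction step, so the Euclidean multiplicative updates, cosine distance, average linkage, and soft-mask reconstruction used here are all assumptions, as are the function names `nmf`, `cluster_bases`, and `separate`.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def nmf(V, rank, n_iter=200, seed=0):
    """Factorize a non-negative matrix V (freq x time) as V ~ W @ H.

    Assumption: Euclidean-cost multiplicative updates; the source does not
    say which NMF variant is used.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3  # basis vectors (columns)
    H = rng.random((rank, V.shape[1])) + 1e-3  # activations (rows)
    eps = 1e-12
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cluster_bases(W, threshold):
    """Group the columns of W by agglomerative (hierarchical) clustering.

    Assumption: cosine distance and average linkage. The number of clusters
    is not given directly; it follows from where `threshold` cuts the tree.
    Columns of W must be non-zero, otherwise cosine distance is undefined.
    """
    tree = linkage(W.T, method="average", metric="cosine")
    return fcluster(tree, t=threshold, criterion="distance")


def separate(V, W, H, labels):
    """Return one magnitude-spectrogram estimate per cluster of basis vectors.

    Assumption: each source is recovered with a soft (ratio) mask built from
    the partial reconstruction of its cluster.
    """
    eps = 1e-12
    full = W @ H + eps
    estimates = {}
    for k in np.unique(labels):
        idx = labels == k
        estimates[int(k)] = V * ((W[:, idx] @ H[idx, :]) / full)
    return estimates
```

With a magnitude spectrogram `V`, a call such as `W, H = nmf(V, rank=20)` followed by `labels = cluster_bases(W, threshold=0.5)` and `separate(V, W, H, labels)` yields one estimate per cluster; the rank and threshold values here are placeholders. Lowering the threshold produces more clusters, and a threshold that is too high reproduces the undesirable case mentioned in the abstract, where most basis vectors fall into a single cluster.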
