Abstract

Nonnegative matrix factorization (NMF) is one technique for audio source separation. It decomposes an input audio signal into a set of basis spectra and a set of intensity trajectories, one for each basis spectrum. We use this technique to extract the sound of a particular musical instrument from a musical audio signal. However, the bases produced by NMF are known not to correspond directly to the instruments in the signal. We therefore cluster the basis spectra using a spectral distance measure so that each cluster corresponds to the audio signal of a particular musical instrument. In this study, we investigate the optimal number of bases and the optimal distance measure for clustering. In experiments on audio signals containing two musical instrument sounds, with mel-cepstrum, linear predictive coding (LPC) cepstrum, and LPC mel-cepstrum as distance measures, LPC mel-cepstrum tended to classify the bases more accurately than the others. This tendency arises because the spectrum obtained with LPC mel-cepstrum retains important peaks needed to classify musical instrument sounds properly. Future work includes integrating LPC mel-cepstrum into the distance measure used inside the NMF algorithm itself, so that the factorization is better suited to extracting a particular musical instrument sound.
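
The pipeline described in the abstract (factorize the spectrogram, then group the basis spectra by a spectral distance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes multiplicative-update NMF with a Euclidean cost, uses a plain low-order real cepstrum as a stand-in for the mel-cepstrum and LPC-based features compared in the study, and groups bases with a simple k-means. The function names `nmf`, `cepstrum`, `cluster_bases` and `separate`, and all parameter defaults, are hypothetical.

```python
import numpy as np

def nmf(V, n_bases, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative spectrogram V (freq x time) as W @ H.

    W holds the basis spectra (columns), H their intensity trajectories (rows).
    Uses multiplicative updates for the Euclidean cost (an assumption here).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_time)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cepstrum(spectrum, n_coef=12, eps=1e-9):
    """Low-order real cepstrum of a magnitude spectrum.

    Stand-in feature only; the study compares mel-cepstrum, LPC cepstrum
    and LPC mel-cepstrum, none of which is implemented here.
    """
    log_mag = np.log(spectrum + eps)
    c = np.fft.irfft(log_mag)
    return c[1:n_coef + 1]  # skip c[0], which mostly reflects overall gain

def cluster_bases(W, n_clusters=2, n_iter=50):
    """Assign each basis spectrum to a cluster by Euclidean cepstral distance."""
    feats = np.array([cepstrum(W[:, k]) for k in range(W.shape[1])])
    # Deterministic farthest-point initialisation of the cluster centers.
    centers = [feats[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in centers], axis=0)
        centers.append(feats[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels

def separate(W, H, labels, n_clusters=2):
    """Rebuild one spectrogram per cluster from its bases and trajectories."""
    return [W[:, labels == j] @ H[labels == j, :] for j in range(n_clusters)]
```

With a magnitude spectrogram `V`, calling `W, H = nmf(V, n_bases)`, then `labels = cluster_bases(W)` and `separate(W, H, labels)` yields one approximate spectrogram per cluster. Whether a cluster matches a single instrument depends on the number of bases and on the distance measure, which are the two choices the study investigates.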
