Abstract

Non-negative matrix factorization (NMF) is attracting considerable attention as a powerful technique for music transcription and audio source separation. In this approach, the magnitude (or power) spectrogram of a mixture signal, interpreted as a non-negative matrix Y, is factorized into the product of two non-negative matrices: a dictionary matrix H and an activation matrix U. Each template vector in the dictionary matrix corresponds to the prototype spectrum of a certain sound component. For NMF to output a decomposition that is musically meaningful as well as accurate, the NMF model HU must be extended with reasonable assumptions. One such assumption concerns the temporal regularity underlying the onset occurrences of musical notes and drum sounds; this periodicity is most apparent in the onset timings of bass instruments and drum sounds. Motivated by this fact, this paper proposes a new constrained NMF that adds to the NMF objective function a criterion promoting the periodicity of the time-varying amplitude associated with each basis spectrum, and derives an iterative algorithm for solving the resulting regularized optimization problem. The proposed method is particularly noteworthy in that it makes it possible to extract periodically occurring audio events in an unsupervised manner. Unsupervised and supervised audio source separation experiments show that the proposed method significantly outperforms conventional approaches, including the original NMF and periodicity-aware music/voice separation.
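
As a rough illustration of the baseline factorization Y ≈ HU described in the abstract, the following is a minimal sketch of plain NMF using the standard multiplicative updates for the Euclidean (Frobenius) distance. It is not the proposed periodicity-constrained algorithm, whose criterion and update rules are not given in the abstract. The function name `nmf_multiplicative`, its parameters, and the toy spectrogram with periodically recurring activations are all assumptions made for illustration.

```python
import numpy as np


def nmf_multiplicative(Y, n_components, n_iter=200, eps=1e-12, seed=0):
    """Plain NMF sketch: factorize non-negative Y (freq x frames) as H @ U.

    Uses the standard multiplicative updates for the Euclidean distance.
    H is the dictionary (freq x components); U holds the activations
    (components x frames). No periodicity term is included here.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    # Random non-negative initialization (an assumption of this sketch).
    H = rng.random((n_freq, n_components)) + eps
    U = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Element-wise ratios keep both factors non-negative.
        U *= (H.T @ Y) / (H.T @ H @ U + eps)
        H *= (Y @ U.T) / (H @ U @ U.T + eps)
    return H, U


# Toy "spectrogram": two hypothetical components whose onsets recur
# periodically (every 8 frames and every 12 frames, respectively).
rng = np.random.default_rng(1)
H_true = rng.random((64, 2))
U_true = np.zeros((2, 120))
U_true[0, ::8] = 1.0
U_true[1, 4::12] = 1.0
Y = H_true @ U_true

H, U = nmf_multiplicative(Y, n_components=2)
rel_err = np.linalg.norm(Y - H @ U) / np.linalg.norm(Y)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch the rows of `U` are the time-varying amplitudes associated with each basis spectrum; a periodicity-promoting criterion such as the one the abstract describes would act on those rows, in addition to the reconstruction error minimized here.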
