Abstract

To improve the performance of singer identification, we propose a system that separates the singing voice from the music accompaniment in monaural recordings. The system consists of two stages. The first stage uses nonnegative matrix partial co-factorization (NMPCF), a joint matrix decomposition that integrates prior knowledge of singing voice and pure accompaniment, to separate the mixture signal into a singing voice portion and an accompaniment portion. In the second stage, the pitches of the singing voice are estimated from the separated singing voice produced by the first stage, and the harmonic components of the singing voice are then identified. Within each frame, the identified harmonic components are regarded as reliable and all other frequency components as unreliable, so the spectrum is incomplete. From the reliable harmonic components, the complete spectra of the singing voice are reconstructed with a missing feature method, spectrum reconstruction, yielding a refined signal with a cleaner singing voice. Experimental results demonstrate that, from the point of view of source separation, the singing voice refinement further improves ΔSNR compared with singing voice separation using NMPCF alone, whereas from the point of view of singer identification, the singing voice separated by NMPCF is more appropriate than the refined singing voice.
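The reliable/unreliable split described for the second stage can be illustrated with a small sketch. It is not the authors' implementation: the function name `harmonic_mask`, the `tolerance_bins` parameter, and the rule of marking the FFT bins nearest each integer multiple of the estimated pitch are assumptions made purely for illustration. Given one frame's pitch estimate, it returns a per-bin mask in which `True` marks a bin treated as reliable, leaving the remaining bins to be filled in by a spectrum reconstruction step.

```python
def harmonic_mask(f0_hz, sample_rate, n_fft, tolerance_bins=1):
    """Mark FFT bins near the harmonics of f0 as reliable (True).

    Hypothetical helper: the abstract does not specify how harmonic
    components are selected, so nearest-bin selection is an assumption.
    Returns a list of booleans for bins 0 .. n_fft // 2.
    """
    n_bins = n_fft // 2 + 1
    mask = [False] * n_bins
    if f0_hz <= 0:
        # Unvoiced frame (no pitch): no bin is marked reliable.
        return mask

    bin_hz = sample_rate / n_fft   # frequency spacing between adjacent bins
    nyquist = sample_rate / 2

    h = 1
    while h * f0_hz <= nyquist:
        center = round(h * f0_hz / bin_hz)   # bin nearest the h-th harmonic
        lo = max(0, center - tolerance_bins)
        hi = min(n_bins - 1, center + tolerance_bins)
        for k in range(lo, hi + 1):
            mask[k] = True
        h += 1
    return mask


# Example: a 250 Hz pitch at 16 kHz with a 1024-point FFT.
frame_mask = harmonic_mask(250.0, 16000, 1024, tolerance_bins=0)
reliable_bins = [k for k, is_reliable in enumerate(frame_mask) if is_reliable]
```

A wider `tolerance_bins` would admit more bins around each harmonic, which trades robustness to pitch estimation error against leakage from the accompaniment; the appropriate value is not stated in the abstract.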
