Abstract
Non-negative matrix factorization (NMF) is a well-known method for separating speech from music in a single-channel source separation setting. In this approach, the spectrogram of each source signal is factorized as the product of two matrices, known as the basis and weight matrices. To obtain a good estimate of the signal spectrogram, the weight and basis matrices are updated iteratively based on a cost function. Standard NMF treats each frame of the signal as an independent observation, which is a drawback. To overcome this weakness, a regularization term is added to the cost function to account for spectral temporal continuity. Furthermore, standard NMF usually uses the same decomposition rank for different sources. In this paper, in addition to using a regularization term, we propose applying a filter to the signals estimated by NMF. The filter is constructed from signals estimated using a regularized NMF method. Moreover, we propose using different decomposition ranks for the speech and music signals, since they are different sources. Experimental results on one hour of speech and music signals show that the proposed method increases the signal-to-interference ratio (SIR) for both speech and music signals compared with conventional NMF methods.
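The pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Euclidean cost with the standard multiplicative updates, omits the temporal-continuity regularization term, and assumes the filter takes the form of a Wiener-like soft mask built from the estimated spectrograms. The function names `nmf` and `separate`, and all parameter values, are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, B=None, seed=0, eps=1e-9):
    """Approximate non-negative V (freq x frames) as B @ W.

    Assumption: Euclidean cost with multiplicative updates.
    If B is supplied, it is kept fixed and only W is updated.
    """
    rng = np.random.default_rng(seed)
    update_basis = B is None
    if update_basis:
        B = rng.random((V.shape[0], rank)) + eps
    W = rng.random((B.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        W *= (B.T @ V) / (B.T @ B @ W + eps)
        if update_basis:
            B *= (V @ W.T) / (B @ W @ W.T + eps)
    return B, W

def separate(V_mix, B_speech, B_music, n_iter=200, eps=1e-9):
    """Decompose a mixture on fixed, pre-trained bases, then apply a soft mask."""
    B = np.hstack([B_speech, B_music])
    _, W = nmf(V_mix, B.shape[1], n_iter=n_iter, B=B)
    r_s = B_speech.shape[1]           # speech rank; music rank may differ
    S_hat = B_speech @ W[:r_s]        # initial NMF estimate of speech
    M_hat = B_music @ W[r_s:]         # initial NMF estimate of music
    mask = S_hat / (S_hat + M_hat + eps)  # assumed Wiener-like filter
    return mask * V_mix, (1.0 - mask) * V_mix
```

In use, one would train the bases separately, for example `B_s, _ = nmf(V_speech_train, rank=8)` and `B_m, _ = nmf(V_music_train, rank=4)`, and then call `separate(V_mix, B_s, B_m)`. The ranks 8 and 4 are placeholders chosen only to show that the two sources need not share a decomposition rank.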