Abstract

Abstract Application of wavelet decomposition is described to speed up the mixed speech signal separation with the help of non-negative matrix factorisation (NMF). It is assumed that the basis vectors of training data of individual speakers had been recorded. In this paper, the spectrogram magnitude of a mixed signal has been factorised with the help of NMF with consideration of sparseness of speech signals. The high frequency components of signal contain very small amount of signal energy. By rejecting the high frequency components, the size of input signal is reduced, which reduces the computational time of matrix factorisation. The signal of lower energy has been separated by using wavelet decomposition. The present work is done for wideband microphone speech signal and standard audio signal from digital video equipment. This shows an improvement in the separation capability using the proposed model as compared with an existing one in terms of correlation between separated and original signals. Obtained signal to distortion ratio (SDR) and signal to interference ratio (SIR) are also larger as compare of the existing model. The proposed model also shows a reduction in computational time, which results in faster operation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call