Frequency Selection Based Separation of Speech Signals with Reduced Computational Time Using Sparse NMF

Yash Vardhan Varshney, Omar Farooq, Zia Ahmad Abbasi, Musiur Raza Abidi

doi:10.1515/aoa-2017-0031

Open Access

https://doi.org/10.1515/aoa-2017-0031

Journal: Archives of Acoustics	Publication Date: Jun 27, 2017
Citations: 9	License type: CC BY-NC-ND 4.0

Affiliation: Aligarh Muslim University

Abstract

The application of wavelet decomposition to speed up the separation of mixed speech signals with non-negative matrix factorisation (NMF) is described. It is assumed that the basis vectors of the training data of the individual speakers have been recorded in advance. The spectrogram magnitude of the mixed signal is factorised using NMF, taking the sparseness of speech signals into account. The high-frequency components of the signal contain only a very small amount of the signal energy. Rejecting these components reduces the size of the input signal, which in turn reduces the computational time of the matrix factorisation. The lower-energy signal is separated out using wavelet decomposition. The work covers wideband microphone speech signals and standard audio signals from digital video equipment. The proposed model improves separation capability compared with an existing model in terms of the correlation between the separated and original signals. The signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) obtained are also higher than those of the existing model. The proposed model also reduces computational time, which results in faster operation.
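The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes pre-trained speaker bases `W1` and `W2`, stands in a simple truncation to the lowest `keep_bins` spectrogram rows for the paper's wavelet-based band split, and uses a standard multiplicative update for the activations with an L1 sparsity penalty. All function names, parameter values, and the synthetic data are hypothetical.

```python
import numpy as np

def sparse_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-12, seed=0):
    """Estimate H >= 0 for fixed W by reducing
    0.5 * ||V - W H||_F^2 + sparsity * sum(H) with multiplicative updates."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V          # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        # The additive sparsity term in the denominator shrinks activations.
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def separate(V, W1, W2, keep_bins, sparsity=0.1, n_iter=200):
    """Split a magnitude spectrogram V (bins x frames) into two speaker
    estimates, using only the lowest keep_bins frequency rows.
    Assumption: plain row truncation replaces the paper's wavelet split."""
    V_low = V[:keep_bins]
    W1_low, W2_low = W1[:keep_bins], W2[:keep_bins]
    W = np.hstack([W1_low, W2_low])
    H = sparse_activations(V_low, W, sparsity, n_iter)
    k1 = W1.shape[1]
    S1 = W1_low @ H[:k1]
    S2 = W2_low @ H[k1:]
    # Soft mask so the two estimates add up to the observed low band.
    total = S1 + S2 + 1e-12
    return V_low * S1 / total, V_low * S2 / total

# Synthetic stand-in for a mixed magnitude spectrogram (hypothetical sizes).
rng = np.random.default_rng(1)
n_bins, n_frames, k = 257, 40, 8
W1 = rng.random((n_bins, k))
W2 = rng.random((n_bins, k))
V = W1 @ rng.random((k, n_frames)) + W2 @ rng.random((k, n_frames))

keep = 128  # factorise roughly half of the rows
E1, E2 = separate(V, W1, W2, keep)
print(E1.shape, E2.shape)
```

Because the factorisation operates on `keep` rows rather than all `n_bins`, each update multiplies smaller matrices, which is the source of the time saving the abstract refers to. How the discarded band is treated in the paper is not reproduced here.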

Full Text