Audio Source Separation Based on Residual Reprojection

Choongsang Cho, Je Woo Kim, Sangkeun Lee

doi:10.4218/etrij.15.0114.1311

Abstract

This paper describes an audio source separation method based on nonnegative matrix factorization (NMF) and expectation maximization (EM). For stable, high-performance separation, an auxiliary separation step is proposed that extracts source residuals and reprojects them onto the appropriate sources, taking into account both an ambiguous region among the sources and the refinement of each source. Specifically, an additional NMF model is designed for the ambiguous region, whose elements are not easily represented by any existing or predefined NMF model of the sources. The residual signal is extracted by inserting this model into the NMF-EM-based audio separation; it is then refined using the weighted parameters of the separation and reprojected onto the separated sources. Experimental results demonstrate that the proposed scheme is more stable than existing algorithms and outperforms them by 4.4 dB on average in terms of the source-to-distortion ratio.
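The residual-and-reprojection idea can be illustrated with a minimal NumPy sketch. This is not the authors' NMF-EM algorithm: it assumes a single-channel nonnegative magnitude spectrogram, swaps the EM procedure for plain KL-divergence multiplicative updates, and uses a simple proportional rule to reproject the residual. The function name `separate_with_residual` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def separate_with_residual(V, source_dicts, n_residual=2, n_iter=200,
                           eps=1e-12, seed=0):
    """Illustrative sketch only (assumed setup, not the paper's method).

    V            : (F, T) nonnegative magnitude spectrogram of the mixture.
    source_dicts : list of (F, K_j) predefined, fixed NMF dictionaries.
    n_residual   : number of free components modelling the ambiguous region.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    # Free dictionary for content the predefined source models cannot explain.
    W_r = rng.random((F, n_residual)) + eps
    W = np.hstack(list(source_dicts) + [W_r])
    k_fixed = W.shape[1] - n_residual
    H = rng.random((W.shape[1], T)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        # KL multiplicative update of all activations.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
        # Update only the residual dictionary; source dictionaries stay fixed.
        WH = W @ H + eps
        H_r = H[k_fixed:]
        W[:, k_fixed:] *= ((V / WH) @ H_r.T) / (ones @ H_r.T + eps)

    # Model spectrogram of each source and of the residual.
    bounds = np.cumsum([0] + [D.shape[1] for D in source_dicts])
    models = [W[:, a:b] @ H[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    residual_model = W[:, k_fixed:] @ H[k_fixed:]
    total = sum(models) + residual_model + eps

    # Wiener-like soft masks split the mixture into sources plus residual.
    sources = [V * m / total for m in models]
    residual = V * residual_model / total

    # Reproject the residual onto the sources in proportion to each
    # source's share of the source-only model (an assumed weighting rule).
    src_total = sum(models) + eps
    refined = [s + residual * m / src_total for s, m in zip(sources, models)]
    return sources, residual, refined
```

Calling the function with the mixture spectrogram and a list of pretrained dictionaries returns the initial source estimates, the extracted residual, and the refined estimates after reprojection. The proportional weighting is only one plausible reading of "weighted parameters of the separation"; the paper's actual refinement rule may differ.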

Full Text