Blind monaural singing voice separation using rank-1 constraint robust principal component analysis and vocal activity detection

Feng Li, Masato Akagi

doi:10.1016/j.neucom.2019.04.030

Abstract

This paper proposes a novel blind separation method for monaural singing voice, Constraint RPCA (CRPCA), which extends robust principal component analysis (RPCA) with a rank-1 constraint. Although conventional RPCA is effective at separating singing voice from a mixed audio signal, it fails when one singular value (e.g., that of a drum) is much larger than all the others (e.g., those of the other accompanying instruments). Instead of minimizing the nuclear norm, the proposed CRPCA method applies a rank-1 constraint to the minimization of singular values in RPCA, which not only makes the solution robust to large differences in dynamic range among instruments but also reduces the computational complexity. Separation quality is further improved by converting the CRPCA output to an ideal binary mask, combining it with a harmonic mask to form a coalescent mask, and finally applying vocal activity detection. Evaluation results on the ccMixter and DSD100 datasets show that the proposed method achieves better separation performance than previous methods.
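The binary masking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that an RPCA-style decomposition has already split the mixture magnitude spectrogram into a low-rank part (accompaniment) and a sparse part (singing voice); the decomposition itself is not shown. The function names `binary_mask` and `apply_mask` and the `gain` parameter are hypothetical and are not taken from the paper, and the harmonic mask and vocal activity detection stages are omitted.

```python
import numpy as np

def binary_mask(low_rank, sparse, gain=1.0):
    """Return a 0/1 time-frequency mask (hypothetical helper).

    A bin is assigned to the voice (1) when the sparse component is
    larger in magnitude than `gain` times the low-rank component.
    """
    return (np.abs(sparse) > gain * np.abs(low_rank)).astype(float)

def apply_mask(mixture, mask):
    """Split a mixture spectrogram into voice and accompaniment estimates."""
    vocals = mask * mixture
    accompaniment = (1.0 - mask) * mixture
    return vocals, accompaniment

if __name__ == "__main__":
    # Toy 2x2 "spectrograms" standing in for an RPCA-style output.
    low_rank = np.array([[3.0, 1.0], [0.5, 2.0]])
    sparse = np.array([[1.0, 4.0], [2.0, 0.5]])
    mask = binary_mask(low_rank, sparse)
    vocals, accompaniment = apply_mask(low_rank + sparse, mask)
    print(mask)
    print(vocals)
    print(accompaniment)
```

Because the mask is binary, every time-frequency bin of the mixture is assigned entirely to one source, so the two estimates always sum back to the mixture.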

Full Text