RPCA-DRNN technique for monaural singing voice separation

Wen-Hsing Lai, Siou-Lin Wang

doi:10.1186/s13636-022-00236-9

Abstract

In this study, we propose a method for separating a singing voice from musical accompaniment in a monaural musical mixture. The proposed method first decomposes the mixture using robust principal component analysis (RPCA), followed by postprocessing that includes median filtering, morphological operations, and high-pass filtering. A deep recurrent neural network (DRNN) comprising two jointly optimized, parallel stacked recurrent neural networks (sRNNs) with mask layers, trained with limited data and computation, is then applied to the decomposed components. This stage corrects misclassified or residual singing and background music left by the initial separation and produces the final estimates of the singing voice and the background music. Experimental results on the MIR-1K, ccMixter, and MUSDB18 datasets, together with a comparison against ten existing techniques, indicate that the proposed method achieves competitive performance in monaural singing voice separation. On MUSDB18, the proposed method reaches comparable separation quality with less training data and lower computational cost than the other state-of-the-art technique.
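The RPCA decomposition step can be sketched as follows. This is a minimal illustration, assuming the principal component pursuit formulation solved with an inexact augmented Lagrange multiplier scheme; the source does not state which solver or trade-off parameter the authors use. The function name `rpca_ialm`, the default `lam = 1 / sqrt(max(m, n))`, and the synthetic test matrix are all assumptions made for illustration. In the setting described above, `M` would presumably be a magnitude spectrogram of the mixture, with the low-rank part approximating the accompaniment and the sparse part approximating the vocals. The postprocessing steps are not shown.

```python
import numpy as np

def soft_threshold(x, tau):
    """Element-wise shrinkage operator."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low-rank L and sparse S with M ~= L + S (hypothetical helper)."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))      # common default, assumed here
    norm_M = np.linalg.norm(M, "fro")
    spectral = np.linalg.norm(M, 2)         # largest singular value
    Y = M / max(spectral, np.abs(M).max() / lam)  # dual variable initialisation
    mu = 1.25 / spectral
    mu_max = mu * 1e7
    rho = 1.5
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        sig = np.maximum(sig - 1.0 / mu, 0.0)
        L = (U * sig) @ Vt
        # Sparse update: element-wise shrinkage
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # Dual update and penalty increase
        R = M - L - S
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_M:
            break
    return L, S

# Synthetic check: rank-3 matrix plus 5% sparse corruption (not audio data)
rng = np.random.default_rng(0)
L_true = rng.standard_normal((60, 3)) @ rng.standard_normal((3, 80))
S_true = np.zeros((60, 80))
support = rng.random((60, 80)) < 0.05
S_true[support] = rng.uniform(-10, 10, size=support.sum())
M = L_true + S_true
L_hat, S_hat = rpca_ialm(M)
```

A larger `lam` pushes more energy into the low-rank part and leaves a sparser residual; how that trade-off is tuned for spectrograms is not specified in the text above.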

Highlights

In a natural environment rich in sound emanating from multiple sources, a target sound reaching our ears is usually mixed with other acoustic interference
The results indicate that the proposed method, which combines robust principal component analysis (RPCA) with deep recurrent neural networks (DRNNs), is superior to all of the reference methods in global normalized source-to-distortion ratio (GNSDR) and global source-to-artifact ratio (GSAR)
A method combining RPCA and supervised DRNN was employed in an experiment to improve the separation of singing voice from musical accompaniment in monophonic mixtures
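For reference, the metrics named in the highlights are commonly defined in the singing voice separation literature as shown here. The page itself does not spell out the definitions, so the notation is an assumption: \hat{v} is the estimated voice, v the reference voice, x the mixture, SDR and SAR the source-to-distortion and source-to-artifact ratios, and w_k a weight for clip k, usually its length.

```latex
\begin{align}
\mathrm{NSDR}(\hat{v}, v, x) &= \mathrm{SDR}(\hat{v}, v) - \mathrm{SDR}(x, v) \\
\mathrm{GNSDR} &= \frac{\sum_{k=1}^{K} w_k \, \mathrm{NSDR}(\hat{v}_k, v_k, x_k)}{\sum_{k=1}^{K} w_k} \\
\mathrm{GSAR} &= \frac{\sum_{k=1}^{K} w_k \, \mathrm{SAR}(\hat{v}_k, v_k)}{\sum_{k=1}^{K} w_k}
\end{align}
```

Under this definition, NSDR measures the SDR improvement of the estimate over using the unprocessed mixture as the estimate, and the global scores are weighted means over the K test clips.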

Summary

Introduction

In a natural environment rich in sound emanating from multiple sources, a target sound reaching our ears is usually mixed with other acoustic interference. We propose using RPCA, which exploits the underlying low-rank property of accompaniments and the sparse property of vocals, to achieve an initial separation. A supervised DRNN trained on limited data is then applied to the RPCA results to correct misclassified or residual singing and background music from the initial separation. The sparse and low-rank matrices obtained after RPCA and postprocessing are sent to their corresponding sRNNs. One sRNN further separates the sparse matrix into estimated singing and musical accompaniment parts, because a residual background music component remains in the initially separated sparse matrix.
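The way one sRNN splits its input into singing and accompaniment parts can be illustrated with a soft time-frequency mask layer. This is a sketch under assumptions: the text above only says the networks have mask layers, and the ratio-mask form below is the one commonly used in DRNN-based separation, not a confirmed detail of this paper. The function name `soft_mask_layer` and the toy arrays are hypothetical.

```python
import numpy as np

def soft_mask_layer(y_voice, y_music, input_mag, eps=1e-8):
    """Turn two raw network outputs into estimates that sum to the input.

    y_voice, y_music: raw outputs of the last hidden layer (same shape).
    input_mag: magnitude spectrogram fed to the network, for example the
               sparse matrix from the initial separation (assumed here).
    """
    a = np.abs(y_voice)
    b = np.abs(y_music)
    mask = a / (a + b + eps)              # soft mask with values in [0, 1]
    voice_est = mask * input_mag
    music_est = (1.0 - mask) * input_mag  # complementary mask
    return voice_est, music_est

# Toy example with 2x2 "spectrograms"
y_v = np.array([[3.0, 0.0], [1.0, 2.0]])
y_m = np.array([[1.0, 2.0], [1.0, 0.0]])
x = np.array([[4.0, 2.0], [2.0, 2.0]])
voice_est, music_est = soft_mask_layer(y_v, y_m, x)
```

Because the two masks are complementary, the two estimates always add up to the network input, so optimizing both outputs jointly redistributes energy between the sources without creating or discarding any.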

Discriminative training

Method

Findings

Conclusions


Journal: EURASIP Journal on Audio, Speech, and Music Processing
Publication Date: Feb 5, 2022
Citations: 5
License type: open-access


