A Skip Attention Mechanism for Monaural Singing Voice Separation

Weitao Yuan, Wenwu Wang, Masashi Unoki, Xiangrui Li, Shengbei Wang

doi:10.1109/lsp.2019.2935867

Abstract

This work proposes a simple but effective attention mechanism, Skip Attention (SA), for monaural singing voice separation (MSVS). First, the SA, embedded in a convolutional encoder-decoder network (CEDN), performs attention-driven dependency modeling of the repetitive structures of the music source. Second, the SA replaces the popular skip connection in the CEDN, which lets it control the flow of low-level (vocal and musical) features to the output and improves feature sensitivity and accuracy for MSVS. Finally, we implement the proposed SA on the Stacked Hourglass Network (SHN), yielding the Skip Attention SHN (SA-SHN). Quantitative and qualitative evaluations show that the proposed SA-SHN achieves a significant performance improvement on the MIR-1K dataset (compared to the state-of-the-art SHN) and competitive MSVS performance on the DSD100 dataset (compared to the state-of-the-art DenseNet), even without any data augmentation.
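
The abstract does not give the SA formulation, so the following is only a rough sketch of the general idea it describes: an attention-weighted path taking the place of an identity skip connection between an encoder stage and a decoder stage. It assumes scaled dot-product attention across time frames, which is a common choice and not necessarily what the paper uses. The function names `skip_attention` and `softmax`, the `(T, C)` feature layout, and the random inputs are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def skip_attention(decoder_feat, encoder_feat):
    """Hypothetical attention-weighted skip path (illustration only).

    decoder_feat: (T, C) high-level features, one row per time frame.
    encoder_feat: (T, C) low-level features from the matching encoder stage.
    Returns (attended, weights): attended is (T, C), weights is (T, T).
    """
    # Assumption: scaled dot-product similarity between frames.
    scale = np.sqrt(decoder_feat.shape[1])
    scores = decoder_feat @ encoder_feat.T / scale  # (T, T)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    # Each output frame is a weighted mix of all encoder frames, so
    # similar (e.g. repeated) frames can contribute to one another.
    attended = weights @ encoder_feat               # (T, C)
    return attended, weights


# Toy data standing in for spectrogram features (not real audio).
rng = np.random.default_rng(0)
T, C = 6, 4
enc = rng.standard_normal((T, C))
dec = rng.standard_normal((T, C))

# A plain skip connection would merge as dec + enc; here the encoder
# features are reweighted by attention before being merged.
attended, weights = skip_attention(dec, enc)
merged = dec + attended
```

In this sketch the attention weights decide how much of each low-level encoder frame reaches the decoder, which is one way to read the abstract's description of controlling the flow of low-level features to the output.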

Journal: IEEE Signal Processing Letters
Publication Date: Oct 1, 2019
Citations: 51

Similar Papers

Enhanced feature network for monaural singing voice separation
Weitao Yuan ... Masashi Unoki
Speech Communication | VOL. 106
19 Nov 2018

On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset
Chao-Ling Hsu ... J.-S.R Jang
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 18
01 Feb 2010

Monaural Singing Voice Separation Using Fusion-Net with Time-Frequency Masking
Feng Li ... Masato Akagi
01 Nov 2019

Analyzing Large Receptive Field Convolutional Networks for Distant Speech Recognition
Salar Jafarlou ... Vinay Kothapally
01 Dec 2019
