Singing voice separation using mono-channel mask

Pallavi P Ingale, Sanjay L Nalbalwar

doi:10.1007/s10772-018-9509-6

Abstract

Separating the singing voice from a monaural song recording is a difficult task, yet an important one, with applications such as singer identification, lyrics recognition, and melody extraction. The difficulty arises from the many musical instruments involved and from the time-varying spectral overlap between the singing voice and the music. The goal of singing voice separation is to extract the singing voice from a given monaural recording with minimal artefacts and musical interference. We propose a three-stage system for singing voice separation that helps improve the intelligibility and perceptual quality of the separated output. In the first stage, a modified sub-harmonic summation algorithm finds the pitch of the singing voice and its harmonic components. In this stage, we create a binary mask. In the second stage, frames, i.e. the masked spectral amplitudes, are classified as singing or non-singing using a combination of Gammatone frequency cepstral coefficient (GFCC) and Mel-frequency cepstral coefficient (MFCC) features. In the third stage, a mono-channel mask is created and the signal amplitude is corrected using a kurtosis measure. We synthesize estimates of the singing voice using both the binary mask and the mono-channel mask, and observe that the mono-channel mask improves the global normalized source-to-distortion ratio (GNSDR). Compared with other methods, the proposed system produces higher GNSDR scores on two different datasets.
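The first-stage idea of turning a pitch track into a binary mask can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `harmonic_binary_mask`, the parameters `n_harmonics` and `tol_hz`, and the rule "keep a bin if it lies within a fixed tolerance of a harmonic" are all assumptions made for illustration, and the pitch values are taken as given rather than computed by sub-harmonic summation.

```python
import numpy as np


def harmonic_binary_mask(f0_hz, sample_rate, n_fft, n_harmonics=10, tol_hz=40.0):
    """Return a (n_bins, n_frames) 0/1 mask keeping bins near harmonics of f0.

    f0_hz: per-frame pitch estimates in Hz; a value <= 0 marks an unpitched frame.
    n_harmonics and tol_hz are illustrative defaults, not values from the paper.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    n_bins = n_fft // 2 + 1
    # Centre frequency of every one-sided STFT bin.
    bin_freqs = np.arange(n_bins) * sample_rate / n_fft
    nyquist = sample_rate / 2.0
    mask = np.zeros((n_bins, f0_hz.size), dtype=np.uint8)

    for t, f0 in enumerate(f0_hz):
        if f0 <= 0:
            continue  # no pitch detected: the whole frame stays zero
        harmonics = f0 * np.arange(1, n_harmonics + 1)
        harmonics = harmonics[harmonics <= nyquist]
        if harmonics.size == 0:
            continue  # pitch above Nyquist: nothing to keep
        # Distance from each bin to its nearest harmonic.
        dist = np.abs(bin_freqs[:, None] - harmonics[None, :]).min(axis=1)
        mask[dist <= tol_hz, t] = 1
    return mask
```

Under these assumptions, multiplying the mask element-wise with a magnitude spectrogram of the same shape yields masked spectral amplitudes of the kind the second stage classifies.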

Journal: International Journal of Speech Technology
Publication Date: Apr 18, 2018
Citations: 1

Similar Papers

Novel Gammatone Filterbank Based Spectro-Temporal Features for Robust Phoneme Recognition
Ankit Nagpal ... Hemant A Patil
01 Jan 2017

Optimizing Integrated Features for Hindi Automatic Speech Recognition System
Mohit Dua ... Mantosh Biswas
Journal of Intelligent Systems | VOL. 29
01 Oct 2018

Mixture linear prediction Gammatone Cepstral features for robust speaker verification under transmission channel noise
Ahmed Krobba ... Sid-Ahmed Selouani
Multimedia Tools and Applications | VOL. 79
09 Mar 2020

A Novel Singer Identification Method Using GMM-UBM
Xulong Zhang ... Juanjuan Li
01 Jan 2019
