Abstract

Separating the singing voice from a monaural song recording is a highly difficult task. It is nevertheless important because it has many applications, such as singer identification, lyrics recognition, and melody extraction. The difficulty arises from the many musical instruments involved and the time-varying spectral overlap between the singing voice and the music. The goal of singing voice separation is to extract the singing voice from a given monaural song recording with minimal artefacts and musical interference. We propose a three-stage system for singing voice separation that improves the intelligibility and perceptual quality of the separated output. In the first stage, a modified sub-harmonic summation algorithm finds the pitch of the singing voice and its harmonic components, and a binary mask is created. In the second stage, the frames, i.e. the masked spectral amplitudes, are classified as singing or non-singing using a combination of Gammatone frequency cepstral coefficient (GFCC) and Mel-frequency cepstral coefficient (MFCC) features. In the third stage, a mono-channel mask is created and signal amplitude correction is performed using a kurtosis measure. We synthesize the estimate of the singing voice using both the binary mask and the mono-channel mask. We observe that the singing voice separated using the mono-channel mask improves the global normalized source-to-distortion ratio (GNSDR). Compared with other methods, the proposed system produces higher GNSDR scores on two different datasets.
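
To make the first stage more concrete, the following is a minimal sketch of plain sub-harmonic summation on a single magnitude-spectrum frame, followed by a harmonic binary mask built from the estimated pitch. It is not the modified algorithm of the proposed system, whose details the abstract does not give. The function names, the search range, the number of harmonics, the decay factor, and the mask half-width are all illustrative assumptions.

```python
import numpy as np


def shs_pitch(mag, sr, n_fft, f_min=80.0, f_max=800.0, n_harm=5, decay=0.84):
    """Estimate the pitch (Hz) of one frame by plain sub-harmonic summation.

    mag is the magnitude spectrum of the frame (length n_fft // 2 + 1).
    All default parameter values are illustrative assumptions.
    """
    freqs = np.arange(len(mag)) * sr / n_fft      # centre frequency of each bin
    candidates = np.arange(f_min, f_max, 1.0)     # candidate pitches, 1 Hz apart
    scores = np.zeros(len(candidates))
    for h in range(1, n_harm + 1):
        # Add the (interpolated) magnitude found at the h-th harmonic of
        # every candidate, weighted so that higher harmonics count less.
        harmonic_mag = np.interp(h * candidates, freqs, mag, right=0.0)
        scores += decay ** (h - 1) * harmonic_mag
    return candidates[np.argmax(scores)]


def harmonic_binary_mask(f0, sr, n_fft, n_bins, half_width_hz=20.0):
    """Boolean mask that is True for bins close to a harmonic of f0."""
    freqs = np.arange(n_bins) * sr / n_fft
    mask = np.zeros(n_bins, dtype=bool)
    n_harm = int((sr / 2) // f0)                  # harmonics below Nyquist
    for h in range(1, n_harm + 1):
        mask |= np.abs(freqs - h * f0) <= half_width_hz
    return mask
```

In use, `shs_pitch` would be applied to each frame of a short-time Fourier transform, and multiplying the frame's spectrum by the mask from `harmonic_binary_mask` keeps only the bins near the estimated harmonics. A fixed half-width in hertz is the simplest choice here; it is an assumption of this sketch, not a statement about how the proposed system builds its mask.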
