Abstract

This paper reviews deep neural network (DNN) methods in music source separation (MSS) tools, with emphasis on Spleeter by Deezer, an enhanced deep learning tool for music source separation. Spleeter is a set of pre-trained models written in Python using the TensorFlow machine learning library. It was developed by Deezer to meet the need to separate a given mixed music track into its constituent instrumental or vocal tracks, usually known as stems. Spleeter offers three pre-trained models, namely the 2-, 4-, and 5-stem separation models, which separate a given mix into 2, 4, and 5 stems respectively. The stems can be used for various purposes such as remixing, up-mixing, and music transcription. This paper is the first of its kind to review DNN methods in MSS. We describe the purpose and use of Spleeter as well as the technical background behind it, which includes Artificial Intelligence (AI), Machine Learning, and Deep Learning, and further Time-Frequency (TF) masking and the U-Net Convolutional Neural Network (CNN), which are the methodology and architecture it employs, respectively. From the review, we found that Spleeter is one of the latest advances on the MSS problem, that it has comparatively one of the best signal-to-distortion ratio (SDR), signal-to-artifacts ratio (SAR), signal-to-interference ratio (SIR), and source image-to-spatial distortion ratio (ISR) scores, that it produces a state-of-the-art solution, and that it has paved the way for further development on the MSS problem.
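
The TF masking idea named in the abstract can be illustrated with a small sketch. It assumes that some model (for example a U-Net) has already produced a magnitude-spectrogram estimate for each stem, and it shows one common form of soft (ratio) masking applied to the mixture spectrogram. The function names, the toy 2x2 spectrograms, and the squared-magnitude ratio are illustrative assumptions, not Spleeter's actual code.

```python
# Illustrative soft (ratio) time-frequency masking on magnitude spectrograms.
# All names and numbers are hypothetical; this is not Spleeter's implementation.


def soft_masks(estimates, eps=1e-10):
    """Turn per-source magnitude estimates into ratio masks.

    estimates: dict mapping a source name to a 2-D list [frame][bin]
    of non-negative magnitudes. In every time-frequency cell the
    returned masks sum to (almost exactly) 1.
    """
    names = list(estimates)
    n_frames = len(estimates[names[0]])
    n_bins = len(estimates[names[0]][0])
    masks = {name: [[0.0] * n_bins for _ in range(n_frames)] for name in names}
    for t in range(n_frames):
        for f in range(n_bins):
            # Total estimated energy in this cell; eps avoids division by zero.
            total = sum(estimates[name][t][f] ** 2 for name in names) + eps
            for name in names:
                masks[name][t][f] = estimates[name][t][f] ** 2 / total
    return masks


def apply_mask(mixture, mask):
    """Multiply the mixture spectrogram by a mask, cell by cell."""
    return [
        [m * x for m, x in zip(mask_row, mix_row)]
        for mask_row, mix_row in zip(mask, mixture)
    ]


# Toy example: 2 frames x 2 frequency bins, two stems.
mixture = [[1.0, 0.5], [0.2, 0.8]]
estimates = {
    "vocals": [[0.9, 0.1], [0.1, 0.2]],
    "accompaniment": [[0.3, 0.4], [0.1, 0.6]],
}
masks = soft_masks(estimates)
stems = {name: apply_mask(mixture, masks[name]) for name in masks}
```

Because the masks sum to about 1 in every cell, the masked stems add back up to the mixture, which is the property that makes masking a convenient way to split one spectrogram into several.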

Highlights

  • Spleeter by Deezer is a set of pre-trained models for music source separation (MSS), written in Python using the TensorFlow machine learning library.

  • This paper is the first of its kind to review deep neural network (DNN) methods in MSS. It describes the purpose and use of Spleeter, developed by Deezer, as well as the technical background behind it, which includes Artificial Intelligence (AI), Machine Learning, and Deep Learning, and further Time-Frequency (TF) masking and the U-Net Convolutional Neural Network (CNN), which are the methodology and architecture it employs, respectively.

  • This paper reviews DNN methods in MSS tools, with a case study on Spleeter, a set of pre-trained models developed by Deezer and written in Python using the TensorFlow machine learning library.

Summary

Introduction

Spleeter by Deezer is a set of pre-trained models for music source separation (MSS), written in Python using the TensorFlow machine learning library. These models are already trained and show state-of-the-art performance in MSS. In 2019, Romain Hennequin et al. [5] presented and released Spleeter, a new tool for music source separation with pre-trained models. This software separates audio files into 2, 4, or 5 stems with a single command line using the pre-trained models. Alexandre Défossez et al. [7] published Demucs, a Deep Extractor for Music Sources trained with extra unlabelled data that is remixed. They considered four sources in their work (drums, bass, vocals, and other accompaniment) and came up with an RNN model that outperformed the existing state-of-the-art waveform-based methods. Their results showed that the method improved on the performance of Open-Unmix, a well-known model.
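
The "single command line" usage described above can be sketched as follows. The snippet only builds the command as an argument list and does not run it. The stem names per model and the `spleeter separate -p ... -o ...` argument order are assumed to match recent Spleeter releases (older releases passed the input file with `-i`), so check `spleeter separate --help` for the installed version. The helper name `build_command` and the file names are hypothetical.

```python
# Stem layout of the three pre-trained models (assumed from recent Spleeter releases).
STEMS = {
    "spleeter:2stems": ["vocals", "accompaniment"],
    "spleeter:4stems": ["vocals", "drums", "bass", "other"],
    "spleeter:5stems": ["vocals", "drums", "bass", "piano", "other"],
}


def build_command(audio_path, n_stems, output_dir="output"):
    """Return the argument list for one separation run (hypothetical helper).

    The list is suitable for subprocess.run(...), but nothing is executed here.
    """
    model = f"spleeter:{n_stems}stems"
    if model not in STEMS:
        raise ValueError(f"no pre-trained model with {n_stems} stems")
    return ["spleeter", "separate", "-p", model, "-o", output_dir, audio_path]


command = build_command("song.mp3", 4)
print(" ".join(command))
print("expected stems:", ", ".join(STEMS["spleeter:4stems"]))
```

Keeping the model identifiers in one table makes it easy to see which stems each pre-trained model yields before any audio is processed.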

Artificial Intelligence
Machine Learning
Deep Learning
Advantages
Challenges
Conclusions and Future scope