Abstract

In applications such as noise removal, speech separation, and music reconstruction, it is important to decompose a single-channel recording into its constituent sources. In this paper, we address single-channel audio source separation and the cocktail party problem using supervised learning, with deep learning as the basis of the method. The data consist of audio-visual videos and recordings of Persian musical instruments, which serve as the audio sources. The proposed algorithm is the first deep-learning-based method for separating the sound sources of Persian musical instruments. In this method, the audio track is first extracted from the video signal; the resulting audio signal, a mixture of sources, is then separated and decomposed by a deep neural network model. According to our evaluations, the proposed method achieves good quantitative and qualitative results when separating two audio sources. Moreover, the method is a general solution that can be applied to a wide range of audio sources and to mixtures of more than two sources.
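The pipeline summarized above (extract the audio track, then estimate each source from the single-channel mixture with a deep network) can be illustrated as mask-based separation in the time-frequency domain. The following PyTorch sketch is only an illustrative assumption about how such a system might be structured, not the authors' architecture; the names (SeparationNet, N_FFT, HOP), the LSTM sizes, and the two-source setting are hypothetical choices.

```python
# Minimal sketch of mask-based single-channel source separation (assumed design,
# not the paper's implementation). A network predicts one magnitude mask per
# source from the mixture spectrogram; masked spectrograms are inverted with
# the mixture phase to recover waveforms.
import torch
import torch.nn as nn

N_FFT = 1024          # STFT size (assumed)
HOP = 256             # hop length (assumed)
N_SOURCES = 2         # two-source separation, as evaluated in the paper
N_BINS = N_FFT // 2 + 1


class SeparationNet(nn.Module):
    """Predicts one sigmoid mask per source from the mixture magnitude."""

    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_BINS, 300, num_layers=2, batch_first=True)
        self.fc = nn.Linear(300, N_BINS * N_SOURCES)

    def forward(self, mag):                     # mag: (batch, frames, bins)
        h, _ = self.rnn(mag)
        masks = torch.sigmoid(self.fc(h))
        return masks.view(mag.size(0), mag.size(1), N_SOURCES, N_BINS)


def separate(mixture, model):
    """mixture: (samples,) mono waveform -> list of estimated source waveforms."""
    window = torch.hann_window(N_FFT)
    spec = torch.stft(mixture, N_FFT, HOP, window=window, return_complex=True)
    mag, phase = spec.abs(), torch.angle(spec)   # each (bins, frames)
    masks = model(mag.T.unsqueeze(0))[0]         # (frames, sources, bins)
    sources = []
    for s in range(N_SOURCES):
        est_mag = masks[:, s, :].T * mag                 # mask the mixture magnitude
        est_spec = est_mag * torch.exp(1j * phase)       # reuse the mixture phase
        sources.append(torch.istft(est_spec, N_FFT, HOP, window=window))
    return sources
```

In such a setup the network would typically be trained on synthetic mixtures of isolated instrument recordings, with a spectrogram or waveform reconstruction loss against the known sources.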
