Abstract

We investigate an audio scene consisting of two main sound sources: (i) instrumental music and (ii) speech. To date, independent component analysis (ICA) has emerged as a powerful technique for blind source separation tasks. However, standard ICA does not handle mixtures whose composition changes over time. In this paper, we investigate this issue and propose a two-pass framework: in the first pass, the system segments the mixed-source input into chunks based on the similarity of the audio features; in the second pass, the system applies ICA to each segmented chunk. We argue that different mixtures of sources exhibit different audio characteristics, and that these characteristics can be extracted using machine learning techniques. The extracted features are used to segment the mixed-source input into chunks. Performing source separation on these chunks yields a better reconstruction of the original sources than performing source separation without segmentation. We present the framework, experimental design, and results of our proposed approach.
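To make the two-pass idea concrete, below is a minimal Python sketch. The segmentation features (a spectral-centroid change detector), the boundary threshold, scikit-learn's FastICA as the separator, and the synthetic two-channel test signal are all illustrative assumptions, not the paper's actual features, classifier, or experimental setup.

```python
# Sketch of the two-pass framework: (1) segment the mixture where frame-level
# features change, (2) run ICA independently on each chunk.
# Assumptions (not from the paper): spectral-centroid features, a simple
# jump threshold for boundaries, and sklearn's FastICA as the ICA stage.
import numpy as np
from sklearn.decomposition import FastICA

def segment_boundaries(mixture, sr, frame_len=2048, hop=512, threshold=2.0):
    """Pass 1: return sample indices where the audio features change.

    Uses the spectral centroid of successive frames of channel 0; the
    paper's actual features and similarity measure may differ.
    """
    n_frames = 1 + (mixture.shape[1] - frame_len) // hop
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroids = np.zeros(n_frames)
    for i in range(n_frames):
        frame = mixture[0, i * hop : i * hop + frame_len]
        mag = np.abs(np.fft.rfft(frame)) + 1e-12
        centroids[i] = np.sum(freqs * mag) / np.sum(mag)
    # Flag frames whose centroid jumps by more than `threshold` std devs.
    jumps = np.abs(np.diff(centroids))
    cut_frames = np.where(jumps > threshold * jumps.std())[0] + 1
    return [0] + [int(f) * hop for f in cut_frames] + [mixture.shape[1]]

def separate_per_chunk(mixture, boundaries, n_sources=2):
    """Pass 2: apply ICA to each chunk independently.

    Note: ICA leaves per-chunk scale and source ordering ambiguous, so
    chunk outputs may need alignment before concatenation in practice.
    """
    estimates = np.zeros((n_sources, mixture.shape[1]))
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        chunk = mixture[:, start:end]            # (n_channels, n_samples)
        ica = FastICA(n_components=n_sources)
        sources = ica.fit_transform(chunk.T).T   # ICA expects (samples, channels)
        estimates[:, start:end] = sources
    return estimates

# Usage with a synthetic two-channel mixture of two stand-in sources:
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr * 2) / sr
s1 = np.sin(2 * np.pi * 440 * t)            # "music" stand-in: a tone
s2 = np.sign(np.sin(2 * np.pi * 3 * t))     # "speech" stand-in: a pulse train
A = np.array([[1.0, 0.5], [0.5, 1.0]])      # instantaneous mixing matrix
mixture = A @ np.vstack([s1, s2])
mixture += 0.01 * rng.standard_normal(mixture.shape)

bounds = segment_boundaries(mixture, sr)
separated = separate_per_chunk(mixture, bounds, n_sources=2)
print(f"{len(bounds) - 1} chunks, separated shape {separated.shape}")
```

Running ICA per chunk means each chunk is modeled as a stationary mixture, which is the sketch's stand-in for the paper's claim that segmentation improves separation when the mixture composition varies over time.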
