Abstract

As a pre-processor to speech recognition, audio source separation can mitigate the degradation in recognition quality for individual signals in scenarios such as the cocktail-party environment. It can also be used in other applications such as audio forensics, speaker verification, instrument identification, and hearing aids. Various techniques are available for single-channel audio source separation, of which those based on Non-negative Matrix Factorization (NMF) are widely used. Several studies have shown considerable improvements in separation performance using NMF on different mixtures of audio signals, such as speech with noise, speech with music, and speech with speech, taken from different audio databases. In this paper, single-channel source separation using NMF and its variants is investigated for two-speaker mixed signals drawn from a single speech database, the GRID speech corpus. The separation performance of phase-aware algorithms is compared with that of phase-unaware approaches based on NMF and its variants. The quality of the separated speech is evaluated while varying parameters such as the number of bases and the analysis window size.
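
As a rough illustration of the phase-unaware NMF approach mentioned above, the following sketch factorizes non-negative magnitude spectrograms and splits a two-speaker mixture with a soft mask. It is a minimal sketch under stated assumptions, not the paper's method: the abstract does not specify the cost function, update rules, or masking scheme, so the Euclidean-cost multiplicative updates and the ratio mask used here are assumptions, and the function names (`nmf`, `fit_activations`, `separate`) are hypothetical. The `n_bases` argument corresponds to the "number of bases" parameter; the analysis window size would be set when computing the spectrograms, which is outside this sketch.

```python
import numpy as np


def nmf(V, n_bases, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the paper's variants may use other costs).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def fit_activations(V, W, n_iter=200, eps=1e-10, seed=0):
    """Estimate activations H for a mixture V while keeping bases W fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H


def separate(V_mix, W1, W2, n_iter=200, eps=1e-10):
    """Split a mixture magnitude spectrogram using per-speaker bases.

    W1 and W2 are bases trained beforehand on each speaker's clean speech.
    Returns two magnitude estimates; a phase-unaware system would reuse
    the mixture phase when inverting them back to waveforms.
    """
    W = np.hstack([W1, W2])
    H = fit_activations(V_mix, W, n_iter=n_iter, eps=eps)
    k = W1.shape[1]
    V1 = W1 @ H[:k]          # speaker 1 part of the model
    V2 = W2 @ H[k:]          # speaker 2 part of the model
    total = V1 + V2 + eps
    # Soft (ratio) masks applied to the mixture magnitudes.
    return V_mix * V1 / total, V_mix * V2 / total
```

In use, one would train `W1` and `W2` by calling `nmf` on each speaker's training spectrogram, then call `separate` on the mixture spectrogram; repeating this for several values of `n_bases` gives the kind of parameter sweep the abstract describes.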
