Abstract

Humans are exceptionally good at focusing their attention on a particular speaker in a noisy environment while mentally muting all other voices and sounds, a capability known as the cocktail party effect that comes naturally to them. Although the human brain and auditory system handle this problem with ease, it is very hard to solve with computational algorithms. This paper proposes a novel technique for separating speech signals when the number of sources exceeds the number of sensors. Wavelets are used for initial pre-processing and for time-frequency analysis of the mixtures to extract the mixing variables, which makes the technique useful in real-time applications. The algorithm also converges faster because an automatic peak-tracking algorithm is used to track the peaks and construct the binary masks. Simulation results on mixtures of speech signals show improved separation in reverberant conditions, as well as better separation and improved perceptual quality of the separated sources in real-world noisy environments.
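To make the idea of peak-based binary masking concrete, the following is a minimal sketch and not the paper's algorithm. It assumes two mixtures whose time-frequency coefficients are already available as flat lists of complex numbers (the wavelet analysis itself is not shown), and it assumes each bin is dominated by a single source. A crude histogram peak picker stands in for the automatic peak tracking mentioned above. The names `amplitude_ratios`, `estimate_peaks`, and `binary_masks`, and the parameters `bins` and `max_ratio`, are hypothetical.

```python
def amplitude_ratios(x1, x2, eps=1e-12):
    """Per-bin amplitude ratio |x2| / |x1| between the two mixtures."""
    return [abs(b) / (abs(a) + eps) for a, b in zip(x1, x2)]


def estimate_peaks(x1, x2, n_sources, bins=50, max_ratio=4.0):
    """Estimate one mixing ratio per source from histogram peaks.

    Assumption: a simple energy-weighted histogram with local-maximum
    picking, used here as a stand-in for a real peak tracker.
    """
    width = max_ratio / bins
    hist = [0.0] * bins
    for a, r in zip(x1, amplitude_ratios(x1, x2)):
        if r < max_ratio:
            hist[int(r / width)] += abs(a)  # weight each bin by its magnitude

    # A bin is a local peak if it rises above its left neighbour and is
    # not lower than its right neighbour.
    padded = [0.0] + hist + [0.0]
    peaks = [i for i in range(bins)
             if padded[i + 1] > padded[i] and padded[i + 1] >= padded[i + 2]]

    # Keep the n_sources strongest peaks and return their bin centres.
    peaks.sort(key=lambda i: hist[i])
    strongest = peaks[-n_sources:]
    return sorted((i + 0.5) * width for i in strongest)


def binary_masks(x1, x2, peaks):
    """Assign every bin to the nearest peak and return one 0/1 mask per peak."""
    ratios = amplitude_ratios(x1, x2)
    owner = [min(range(len(peaks)), key=lambda k: abs(r - peaks[k]))
             for r in ratios]
    return [[1 if o == k else 0 for o in owner] for k in range(len(peaks))]
```

Multiplying a mask element-wise with the coefficients of one mixture gives an estimate of the corresponding source in the time-frequency domain. Because each mask selects a disjoint set of bins, more sources than sensors can be handled as long as the sources rarely overlap in the same bin.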
