Abstract

Speech is subject to various kinds of acoustic interference in natural environments, and many applications require an effective way to separate the dominant signal from that interference. In this paper, an unsupervised method for single-channel speech separation based on the Short-Time Fourier Transform (STFT) is proposed. It uses the pitch information of the dominant and interfering speakers to generate a time-frequency mask based on their pitch frequencies. Rigorous objective and subjective evaluations show that the proposed system provides a better signal-to-noise ratio (SNR) and a better Perceptual Evaluation of Speech Quality (PESQ) score than other related methods available in the literature.
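
The idea of a pitch-based time-frequency mask can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the pitch of both speakers is already known for a given STFT frame, and it uses a simple, hypothetical rule that assigns each frequency bin to the speaker whose nearest pitch harmonic is closer. The names `harmonic_distance` and `pitch_mask`, the bin spacing, and the pitch values are all illustrative assumptions.

```python
def harmonic_distance(freq_hz, pitch_hz):
    """Distance in Hz from freq_hz to the nearest harmonic (k >= 1) of pitch_hz."""
    k = max(1, round(freq_hz / pitch_hz))
    return abs(freq_hz - k * pitch_hz)


def pitch_mask(bin_freqs_hz, target_pitch_hz, interferer_pitch_hz):
    """Binary mask for one STFT frame (assumed rule, not the paper's).

    A bin is kept (1) when it lies at least as close to a harmonic of the
    target speaker's pitch as to a harmonic of the interferer's pitch.
    """
    mask = []
    for f in bin_freqs_hz:
        d_target = harmonic_distance(f, target_pitch_hz)
        d_interf = harmonic_distance(f, interferer_pitch_hz)
        mask.append(1 if d_target <= d_interf else 0)
    return mask


def apply_mask(frame_magnitudes, mask):
    """Zero out the bins that the mask attributes to the interferer."""
    return [m * g for m, g in zip(frame_magnitudes, mask)]


# Hypothetical frame: bins spaced 50 Hz apart, target pitch 100 Hz,
# interfering pitch 150 Hz.
bin_freqs = [k * 50.0 for k in range(10)]
mask = pitch_mask(bin_freqs, target_pitch_hz=100.0, interferer_pitch_hz=150.0)
separated = apply_mask([1.0] * len(bin_freqs), mask)
```

In a full system the mask would be computed for every frame from estimated pitch tracks and applied to the complex STFT before the inverse transform. Bins where harmonics of both speakers coincide (300 Hz in this example) are ambiguous, and this sketch simply assigns them to the target.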
