Abstract

Modified group delay functions are gaining significance in the literature for formant estimation, speaker recognition and speech recognition. In particular, group delay functions offer higher resolution than the magnitude spectrum. In this paper, modified group delay functions are used to estimate and track two pitches in concurrent speech. The power spectrum of the speech signal is first flattened to annihilate the system characteristics while retaining the source characteristics. Group delay analysis of the flattened spectrum is then performed and the predominant pitch is computed. Next, a comb filter is designed to remove the predominant pitch and its harmonics from the group delay spectrum. The residual spectrum is again subjected to modified group delay processing to estimate the second candidate pitch. The first- and second-pass pitch trajectories are corrected using post-processing. The performance of the proposed algorithm was evaluated on two datasets using two metrics: pitch accuracy and standard deviation of fine pitch error. Our results show that phase-based processing holds promise for multipitch estimation.
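
For readers unfamiliar with the group delay function that the method builds on, the following is a minimal sketch of the standard (unmodified) group delay computed directly from the signal, without phase unwrapping. It is not the algorithm of the paper: the spectral flattening, the comb filter, and the modification itself are not shown (modified variants typically replace the denominator |X|^2 with a smoothed spectral estimate, which is only assumed here and not taken from this abstract). The function names `dft` and `group_delay` and the `eps` guard are illustrative choices.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def group_delay(x, eps=1e-12):
    """Group delay in samples at each DFT bin.

    Uses tau[k] = (X_R*Y_R + X_I*Y_I) / |X|^2,
    where X = DFT(x[n]) and Y = DFT(n * x[n]).
    """
    X = dft(x)
    Y = dft([t * v for t, v in enumerate(x)])
    tau = []
    for xk, yk in zip(X, Y):
        power = abs(xk) ** 2
        # eps is an illustrative guard against division by zero at spectral nulls.
        tau.append((xk.real * yk.real + xk.imag * yk.imag) / (power + eps))
    return tau


# Sanity check: an impulse delayed by 3 samples has a flat group delay of 3.
impulse = [0.0] * 16
impulse[3] = 1.0
delays = group_delay(impulse)
```

The delayed impulse is a convenient check because its magnitude spectrum is flat, so the group delay reduces to the delay itself at every bin.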
