Abstract

While many readily available pitch tracking algorithms can accurately track pitch on studio-quality speech data, robust performance in real operational environments is still an elusive goal. In this paper, three pitch detection algorithms are evaluated over a database of speech data collected over a wide range of telephone lines, including long-distance exchanges. The speech material consists of excerpts from typical telephone conversations, collected at the receiving end of a two-party exchange. Subjective and objective evaluations were conducted on the three pitch tracking algorithms: an improved version of the Integrated Correlation [1,2] pitch tracker, the Gold-Rabiner [3,4] parallel processing algorithm, and the NSA LPC-10 DYPTRACK Version 43 [5-8] algorithm. A comparative analysis of these algorithms indicates that the Integrated Correlation pitch tracker provides significantly better performance, mainly due to its ability to make accurate voicing decisions in noisy environments. In addition, intelligibility tests demonstrate that the intelligibility of synthetic speech is correlated with the accuracy of pitch estimation. This result reinforces our belief that accurate pitch tracking is crucial to the operational acceptance of the speech quality produced by the LPC pitch-excited vocoder.
