Abstract

A comparative performance study of five pitch detection algorithms was conducted. A speech database consisting of eight utterances spoken by three males, three females, and one child was constructed. Both telephone and wideband recordings were made of each utterance. For each utterance in the database, a “standard” pitch contour was measured semiautomatically using a highly sophisticated interactive pitch detection program. This “standard” pitch contour was then compared with the pitch contour obtained from each of the five programmed pitch detectors. The algorithms used in this study were (1) a center clipping, infinite-peak clipping, modified autocorrelation method; (2) the cepstral method; (3) the SIFT method; (4) the parallel processing time domain method; and (5) the data reduction method. A set of measurements was made on the pitch contours to quantify the various types of errors that occur in each of these methods. The error measurements included the average and standard deviation of the error in pitch period during voiced regions, the number of gross errors in the pitch period, and the average and standard deviation of the error in choosing the onset and offset of voicing. By pooling the various error measurements, the individual pitch detectors could be rank ordered as a measure of their relative performance.
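
The kind of contour comparison described above can be illustrated with a minimal sketch. It is not the study's procedure: the function name `compare_contours`, the convention that a pitch period of 0 marks an unvoiced frame, and the `gross_threshold` value are all assumptions made for illustration, and the sketch counts voicing disagreements per frame rather than measuring onset and offset timing errors as the study did.

```python
from statistics import mean, pstdev


def compare_contours(reference, estimate, gross_threshold=10.0):
    """Compare two pitch-period contours frame by frame.

    Assumed convention: 0 marks an unvoiced frame; any positive value
    is a pitch period in samples. gross_threshold is a hypothetical
    cutoff, not a value taken from the abstract.
    """
    if len(reference) != len(estimate):
        raise ValueError("contours must have the same number of frames")

    fine_errors = []          # small period errors in frames both call voiced
    gross_errors = 0          # period errors at or above the threshold
    voicing_mismatches = 0    # frames where the voiced/unvoiced decisions differ

    for ref, est in zip(reference, estimate):
        ref_voiced, est_voiced = ref > 0, est > 0
        if ref_voiced != est_voiced:
            voicing_mismatches += 1
        elif ref_voiced:
            error = est - ref
            if abs(error) >= gross_threshold:
                gross_errors += 1
            else:
                fine_errors.append(error)

    return {
        "fine_mean": mean(fine_errors) if fine_errors else 0.0,
        "fine_std": pstdev(fine_errors) if fine_errors else 0.0,
        "gross_errors": gross_errors,
        "voicing_mismatches": voicing_mismatches,
    }


# Made-up contours for illustration only, not data from the study.
standard = [0, 80, 80, 82, 0]
detected = [0, 81, 79, 41, 78]
result = compare_contours(standard, detected)
```

Separating gross errors from fine errors keeps a single pitch-halving or pitch-doubling frame from dominating the mean and standard deviation of the remaining voiced frames.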
