A Pitch Detection Algorithm for Continuous Speech Signals Using Viterbi Traceback with Temporal Forgetting

J Bartošek

doi:10.14311/1422

Abstract

This paper presents a pitch-detection algorithm (PDA) for signals containing continuous speech. The core of the method is merged normalized forward-backward correlation (MNFBC), which works in the time domain and can make basic voicing decisions. The Viterbi traceback procedure is then used to post-process the MNFBC output, considering the three best fundamental frequency (F0) candidates in each step. This should make the final pitch contour smoother and should also prevent octave errors. In computing the transition probabilities between F0 candidates, two major improvements were made over existing post-processing methods. Firstly, pitch distance is compared in musical cent units. Secondly, temporal forgetting is applied in order to avoid penalizing pitch jumps after prosodic pauses of one speaker or changes in pitch connected with turn-taking in dialogs. Results computed on a pitch-reference database clearly show the benefit of the first improvement, but they have not yet shown any benefit from the temporal modification. We assume this is due to the nature of the reference corpus, which had a small amount of suprasegmental content.
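The two ideas named in the abstract, cent-based pitch distance and temporal forgetting, can be illustrated with a minimal sketch. The cent formula (1200 times the base-2 logarithm of the frequency ratio) is standard. Everything else here is an assumption made for illustration and is not taken from the paper: the function names, the additive cost formulation, the exponential form of the forgetting term, and the constants `weight` and `tau`.

```python
import math

def cents(f1, f2):
    """Absolute pitch distance between two frequencies in musical cents."""
    return abs(1200.0 * math.log2(f2 / f1))

def viterbi_f0(candidates, scores, times, weight=0.01, tau=0.2):
    """Pick one F0 candidate per frame by minimising a cumulative cost.

    candidates[t][k]: k-th F0 candidate (Hz) in frame t
    scores[t][k]:     its local score in [0, 1] (higher is better)
    times[t]:         frame time in seconds
    weight, tau:      hypothetical tuning constants (not from the paper)
    """
    n = len(candidates)
    cost = [[1.0 - s for s in scores[0]]]
    back = [[0] * len(candidates[0])]
    for t in range(1, n):
        # Assumed forgetting: the jump penalty decays with the time gap,
        # so a pitch jump after a long pause is barely penalised.
        forget = math.exp(-(times[t] - times[t - 1]) / tau)
        row_cost, row_back = [], []
        for k, f in enumerate(candidates[t]):
            best_j, best_c = 0, float("inf")
            for j, g in enumerate(candidates[t - 1]):
                c = cost[t - 1][j] + weight * forget * cents(g, f)
                if c < best_c:
                    best_j, best_c = j, c
            row_cost.append(best_c + (1.0 - scores[t][k]))
            row_back.append(best_j)
        cost.append(row_cost)
        back.append(row_back)
    # Traceback from the cheapest final state.
    k = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    path = [0.0] * n
    for t in range(n - 1, -1, -1):
        path[t] = candidates[t][k]
        k = back[t][k]
    return path
```

Measured in cents, a step from 100 Hz to 110 Hz is the same distance as a step from 200 Hz to 220 Hz, whereas the direct frequency difference doubles. In this sketch an octave jump between adjacent frames costs far more than choosing a slightly lower-scoring candidate, while the same jump after a gap of several seconds is almost free.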

Highlights

Almost every audible sound tends to have a fundamental frequency
MNFBCv1 is the basic variant, with the voiced/unvoiced (V/UV) decision threshold set to 0.5 and with the transition probability of the Viterbi procedure computed from the direct frequency difference
To compare our method with other widely used methods, we added the results for autocorrelation in the frequency domain (ACF freq, a very good method for tracking singing) and the Direct Frequency Estimation method (DFE) [8], which is currently used for evaluating Parkinson’s disease at FEE CTU in Prague

Summary

Introduction

Almost every audible sound tends to have a fundamental frequency. This is the lowest frequency on which the signal is periodic, and we sense this frequency as the height (pitch) of the sound. Human speech perception is partly based on intonation (changes of pitch), which is an aspect of prosody. Thanks to this we can distinguish whether a person is making a statement or asking a question [1]. One motivation for finding a precise and robust PDA is to track the intonation contour in continuous speech. This is a crucial step for the proper functioning of, e.g., a punctuation detector [2] or a classifier of the speaker's emotions. Several pitch detection methods are known today. They can generally be divided according to the domain in which they operate (time, frequency, cepstrum, etc.). An overview of some basic methods can be found in [12]. AMDF [5] (time domain), the cepstral method [4] (modification of the spectrum domain) and sub-harmonic summation (SHS) [3] are well described and widely used methods.
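As a concrete example of a time-domain method, the following is a minimal sketch of pitch estimation with the average magnitude difference function (AMDF) mentioned above. It is a textbook-style illustration and not the MNFBC method of this paper. The function name `amdf_pitch`, the search range and the synthetic test signal are assumptions chosen for the example.

```python
import math

def amdf_pitch(frame, fs, fmin, fmax):
    """Estimate F0 (Hz) of one frame as fs / lag at the AMDF minimum."""
    lo = int(fs / fmax)           # shortest period considered, in samples
    hi = int(fs / fmin)           # longest period considered, in samples
    best_lag, best_val = lo, float("inf")
    for lag in range(lo, hi + 1):
        n = len(frame) - lag
        # Mean absolute difference between the frame and its shifted copy.
        d = sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n
        if d < best_val:
            best_lag, best_val = lag, d
    return fs / best_lag

# Synthetic 200 Hz sine sampled at 8 kHz (period of 40 samples).
fs = 8000
frame = [math.sin(2 * math.pi * 200 * i / fs) for i in range(400)]
f0 = amdf_pitch(frame, fs, fmin=110, fmax=400)
```

The search range here is deliberately narrow. For a periodic signal the AMDF is also near zero at integer multiples of the true period, so a wider range can return half the true frequency; this is one source of the octave errors that post-processing is meant to suppress.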

A description of PDA using MNFBC

Viterbi post-processing

Test conditions

Evaluation criteria

Results and discussion

Conclusion


Journal: Acta Polytechnica	Publication Date: Jan 5, 2011
Citations: 9	License type: cc-by


