Abstract

We propose a robust event-based method for estimating the instantaneous fundamental frequency of a voiced speech signal. The amplitude and frequency modulated (AM-FM) signal model of voiced speech in the low frequency range (LFR) indicates that energy is present only around the instantaneous fundamental frequency (F0) and a few of its harmonics. The time-varying F0 component of a voiced speech signal is extracted by a robust algorithm that iteratively performs eigenvalue decomposition (EVD) of a Hankel matrix, initially constructed from samples of the LFR filtered voiced speech signal. The negative cycles of the extracted time-varying F0 component provide a reliable coarse estimate of the intervals in which glottal closure instants (GCIs) may be present. The negative cycles of the LFR filtered voiced speech signal that occur within these intervals are isolated. There is a sudden decrease in the glottal impedance at GCIs, resulting in high signal strength. GCIs are therefore detected as local minima in the derivative of the falling edges of the isolated negative cycles of the LFR filtered voiced speech signal, followed by a selection criterion that discards false GCI candidates. The instantaneous F0 is estimated as the inverse of the time interval between two consecutive GCIs. Experiments were performed on the Keele and CSTR speech databases in white and babble noise environments at various levels of degradation to assess the performance of the proposed method. The proposed method substantially reduces gross F0 estimation errors in comparison with some state-of-the-art methods.
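
Two of the steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `dominant_hankel_component` and `instantaneous_f0` are hypothetical, the Hankel step is a single-pass simplification (the abstract does not give the iteration or stopping rule of the actual algorithm), and keeping the two largest-magnitude eigenpairs is an assumption based on the fact that a single real sinusoid yields a rank-2 Hankel matrix.

```python
import numpy as np


def dominant_hankel_component(x):
    """Single-pass sketch: keep the dominant eigenpair pair of a Hankel matrix.

    Assumption: the strongest sinusoid-like component corresponds to the two
    eigenvalues of largest magnitude. The paper's iterative procedure is not
    reproduced here.
    """
    x = np.asarray(x, dtype=float)
    n = (len(x) + 1) // 2
    x = x[: 2 * n - 1]  # a square n-by-n Hankel matrix needs 2n-1 samples

    # H[i, j] = x[i + j]; a square Hankel matrix is symmetric, so eigh applies.
    rows = np.arange(n)[:, None]
    cols = np.arange(n)[None, :]
    H = x[rows + cols]
    eigvals, eigvecs = np.linalg.eigh(H)

    # Select the two eigenpairs with the largest absolute eigenvalues.
    idx = np.argsort(np.abs(eigvals))[-2:]
    H_dom = (eigvecs[:, idx] * eigvals[idx]) @ eigvecs[:, idx].T

    # Average each anti-diagonal to map the matrix back to a time series.
    flipped = H_dom[::-1]
    return np.array(
        [np.mean(np.diagonal(flipped, offset=k)) for k in range(-(n - 1), n)]
    )


def instantaneous_f0(gci_times_s):
    """F0 in Hz as the inverse of the interval between consecutive GCIs."""
    gci = np.asarray(gci_times_s, dtype=float)
    periods = np.diff(gci)
    if np.any(periods <= 0):
        raise ValueError("GCI times must be strictly increasing")
    return 1.0 / periods
```

For example, GCIs detected at 0 s, 0.010 s and 0.018 s give pitch periods of 10 ms and 8 ms, so `instantaneous_f0` returns values close to 100 Hz and 125 Hz. The GCI detection itself (local minima in the derivative of the falling edges, plus the selection criterion) is not sketched, because the abstract does not specify its details.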
