Performance of an Event-Based Instantaneous Fundamental Frequency Estimator for Distant Speech Signals

Guruprasad Seshadri, B Yegnanarayana

doi:10.1109/tasl.2010.2101595

Abstract

This paper proposes a method for extracting the fundamental frequency of voiced speech from distant speech signals. The method is based on the impulse-like nature of excitation in voiced speech. The characteristics of the impulse-like excitation are extracted by filtering the speech signal through a cascade of resonators located at zero frequency. The filtered signal preserves information specific to the fundamental frequency in its sequence of positive-to-negative zero crossings, and it is free from the effects of vocal tract resonances. An estimate of the fundamental frequency is derived from the short-time spectrum of the filtered signal, and this estimate is used to remove spurious zero crossings in the filtered signal. The proposed method depends only on the strengths of the impulse-like excitations in the direct component of distant speech signals, and not on the similarity of the speech signal across successive glottal cycles. Hence, the method is robust to the effects of reverberation and noise. The performance of the method is evaluated using a database of close-speaking and distant speech signals. Experiments show that the accuracy of the proposed method is significantly higher than that of existing methods based on time-domain and frequency-domain processing.
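
The filtering and zero-crossing steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the cascade is taken to be two zero-frequency resonators (four running sums in total), the slowly growing trend of the resonator output is removed by repeatedly subtracting a local mean, and the trend-removal window (`window_ms`) is fixed by hand rather than derived from the signal. The function names, the synthetic test signal, and all parameter values are hypothetical. The spectrum-based removal of spurious zero crossings is not shown.

```python
import numpy as np


def zero_frequency_filter(x, fs, window_ms=15.0, passes=3):
    """Pass x through a cascade of zero-frequency resonators (illustrative sketch).

    Assumptions: two resonators, each with a double pole at z = 1, and
    trend removal by repeated local-mean subtraction over window_ms.
    Returns the filtered signal and its sample offset into x.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.diff(x, prepend=x[0])       # difference the signal to remove any DC bias
    for _ in range(4):                 # two resonators = four running sums
        y = np.cumsum(y)
    half = int(round(window_ms * 1e-3 * fs / 2))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):
        # Subtract the local mean; keep only samples with a full window,
        # so each pass trims `half` samples from both ends.
        y = y[half:-half] - np.convolve(y, kernel, mode="valid")
    return y, passes * half


def instantaneous_f0(filtered, fs):
    """Locate positive-to-negative zero crossings and convert intervals to Hz."""
    crossings = np.flatnonzero((filtered[:-1] > 0) & (filtered[1:] <= 0))
    return crossings, fs / np.diff(crossings)


# Synthetic voiced segment: 100 Hz impulse train exciting one damped resonance.
fs = 8000
n = 1600
excitation = np.zeros(n)
excitation[::80] = 1.0                 # one impulse every 80 samples (100 Hz)
t = np.arange(200)
resonance = np.exp(-t / 20.0) * np.sin(2 * np.pi * 1000 * t / fs)
speech = np.convolve(excitation, resonance)[:n]

filtered, offset = zero_frequency_filter(speech, fs)
crossings, f0 = instantaneous_f0(filtered, fs)
median_f0 = float(np.median(f0))
print(f"median F0: {median_f0:.1f} Hz")
```

In this sketch the crossing indices are relative to the trimmed filtered signal, so adding `offset` maps them back to sample positions in the input. The running sums grow polynomially with signal length, so a long recording would likely need to be processed in segments to avoid loss of floating-point precision.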

Full Text