This paper deals with study of formant and harmonic contours by processing the group delay (GD) spectrograms of speech signals. The GD spectrum is the negative derivative of the phase spectrum with respect to frequency. Recent study shows that the GD spectrogram can be obtained without phase wrapping. Formant frequency contours can be observed in the display of the peaks of the instantaneous wideband equivalent GD spectrogram, derived using the modified single frequency filtering (SFF) analysis of speech signals. Harmonic frequency contours can be observed in the display of the peaks of the instantaneous narrowband equivalent GD spectrogram, derived using the modified SFF analysis of speech signals. For synthetic speech signals, the observed formant contours match the ground truth formant contours from which the signal is derived. For natural speech signals, the observed formant contours match approximately with the given ground truth formant contours mostly in the voiced regions. The results are illustrated for several randomly selected utterances from the TIMIT database. While this study helps to observe the contours of formants in the display, automatic extraction of the formant frequencies needs further processing, requiring logic for eliminating the spurious points, without forcing the number of formants.
Read full abstract