Automatic selection of the number of poles for different gender and age groups in steady-state isolated vowels

Thayabaran Kathiresan, Volker Dellwo, Dieter Maurer

doi:10.1121/1.4969511

Abstract

Formant frequency estimation using a linear prediction (LPC) algorithm is based on the assumption of an age- and gender-specific number of poles. However, when visually cross-checking the calculated formant frequencies against a spectrogram, investigators often change this parameter because of a lack of correspondence. The misprediction is mainly due to high variation within the calculated formant tracks, or to tracks not matching the spectral peaks, possibly combined with an unexpectedly low or high number of occurring formants (e.g., formant merging, spurious formants). To solve the problem of changing the number of filter poles, we propose a new method which addresses the first aspect, the constancy of formant tracks. For a given vowel sound, the first three formant frequencies are calculated for three different settings (number of poles = 10, 12, and 14 for a frequency range of 0-5.5 kHz). The standard deviation of the formant tracks is used to find a Euclidean distance for each of the three settings separately. The algorithm chooses the setting that produces the least variability (minimum Euclidean distance) in steady-state vowel nuclei. We tested the method on vowel sounds of standard German /i, y, e, ø, ɛ, a, o, u/ produced by 14 men, 14 women, and 8 children.
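The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out how the Euclidean distance is formed, so the sketch assumes it is the Euclidean norm of the vector of per-formant standard deviations (SD of F1, F2, F3) over the frames of the vowel nucleus. The function names `track_variability` and `select_num_poles` are hypothetical, the formant tracks are assumed to have been computed beforehand by some LPC tool, and the example tracks are synthetic numbers made up for illustration, not data from the study.

```python
import numpy as np


def track_variability(tracks):
    """Return the Euclidean norm of the per-formant standard deviations.

    tracks: array-like of shape (n_frames, 3) holding F1, F2, F3 in Hz.
    Assumption: this norm is how the "Euclidean distance" is obtained.
    """
    tracks = np.asarray(tracks, dtype=float)
    sds = np.std(tracks, axis=0)  # one SD per formant track
    return float(np.linalg.norm(sds))


def select_num_poles(tracks_by_poles):
    """Pick the pole setting whose formant tracks vary least.

    tracks_by_poles: dict mapping number of poles -> (n_frames, 3) tracks.
    Returns the chosen number of poles and the score of every setting.
    """
    scores = {p: track_variability(t) for p, t in tracks_by_poles.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic illustration only: three settings with different track jitter.
rng = np.random.default_rng(0)
base = np.array([300.0, 2300.0, 3000.0])  # made-up F1-F3 values in Hz
example = {
    10: base + rng.normal(0, 80, size=(50, 3)),
    12: base + rng.normal(0, 10, size=(50, 3)),
    14: base + rng.normal(0, 40, size=(50, 3)),
}
best, scores = select_num_poles(example)
print(best, scores)
```

In this sketch the setting with the smallest score is returned, mirroring the "minimum Euclidean distance" criterion. Whether the standard deviations should be normalized per formant before combining them is not stated in the abstract and is left open here.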
