Speaker identification: New vectors for SAUSI

Harry Hollien, Charles C Johnson, E T Doherty

doi:10.1121/1.2004070


Open Access



Abstract

A semiautomatic speaker identification system (SAUSI) has been described (Proc. IEEE Conf. ASSP, 768–771 (1977)). This system employs four major vectors: (1) long-term speech spectra, (2) f0 analysis, (3) formant tracking, and (4) temporal analysis. Recently, the f0 analysis and temporal analysis vectors have been replaced by new ones each consisting of an expanded number of parameters. The new f0 vector is based on the mean and s.d. of the frequency distribution plus a parameter array consisting of the numbers of frequencies falling into 15–20 semitone interval “bins.” The temporal vector consists of an array of four 2–34 parameter subvectors as follows: (1) duration of speech rate (DSR), (2) voice/voiceless speech-time ratio (V/VL), (3) time-energy distribution (TED), and (4) vowel/consonant duration ratio (V/C). The nature and analysis techniques of these several vectors will be discussed as will their robustness as speaker identification cues in studies involving speaker and system distortions.
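As a rough illustration of the f0 vector described above (mean, s.d., and an array of counts in semitone-interval bins), the following Python sketch may help. It is not the authors' implementation. The function name `f0_vector`, the 55 Hz reference frequency, the choice of 15 bins each 2 semitones wide, the clamping of out-of-range values into the edge bins, and the convention that unvoiced frames are coded as 0 are all assumptions made here for illustration; the abstract does not specify them.

```python
import math
from statistics import mean, pstdev


def f0_vector(f0_hz, ref_hz=55.0, n_bins=15, bin_width_st=2.0):
    """Build a hypothetical f0 feature vector: [mean, s.d., bin counts...].

    f0_hz        -- per-frame f0 estimates in Hz (0 = unvoiced, an assumption)
    ref_hz       -- lower edge of the first bin (assumed, not from the abstract)
    n_bins       -- number of semitone-interval bins (assumed)
    bin_width_st -- width of each bin in semitones (assumed)
    """
    voiced = [f for f in f0_hz if f > 0]  # drop unvoiced frames
    if not voiced:
        raise ValueError("no voiced frames")

    counts = [0] * n_bins
    for f in voiced:
        # Distance above the reference in semitones (12 per octave).
        semitones = 12.0 * math.log2(f / ref_hz)
        idx = int(semitones // bin_width_st)
        # Clamp values outside the covered range into the edge bins.
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1

    # Mean and (population) standard deviation of the f0 distribution in Hz,
    # followed by the array of per-bin counts.
    return [mean(voiced), pstdev(voiced)] + counts


if __name__ == "__main__":
    frames = [110.0, 110.0, 220.0, 0.0]
    print(f0_vector(frames))
```

With these assumed defaults the vector has 17 elements: two distribution statistics followed by 15 bin counts. Normalizing the counts by the number of voiced frames would be a natural variation when utterances differ in length.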
