Abstract

A semiautomatic speaker identification system (SAUSI) has been described (Proc. IEEE Conf. ASSP, 768–771 (1977)). This system employs four major vectors: (1) long-term speech spectra, (2) f0 analysis, (3) formant tracking, and (4) temporal analysis. Recently, the f0 analysis and temporal analysis vectors have been replaced by new ones, each consisting of an expanded number of parameters. The new f0 vector is based on the mean and standard deviation of the frequency distribution plus a parameter array consisting of the numbers of frequencies falling into 15–20 semitone interval “bins.” The temporal vector consists of an array of four 2–34 parameter subvectors as follows: (1) duration of speech rate (DSR), (2) voice/voiceless speech-time ratio (V/VL), (3) time-energy distribution (TED), and (4) vowel/consonant duration ratio (V/C). The nature of these vectors and the techniques used to analyze them will be discussed, as will their robustness as speaker identification cues in studies involving speaker and system distortions.
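
As an illustration only, the f0 vector described above (mean, standard deviation, and counts per semitone bin) might be assembled roughly as follows. This is a minimal sketch, not the SAUSI implementation: the function name `f0_vector`, the 55 Hz reference frequency, the one-semitone bin width, the choice of 20 bins, the use of Hz for the mean and standard deviation, and the convention that a value of 0 marks an unvoiced frame are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def f0_vector(f0_hz, ref_hz=55.0, n_bins=20):
    """Hypothetical f0 feature vector: [mean, s.d., count per semitone bin].

    ref_hz and n_bins are illustrative assumptions, not values from the paper.
    """
    # Assumption: frames with f0 <= 0 are unvoiced and are skipped.
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in input")

    # Count how many f0 values fall into each one-semitone bin above ref_hz.
    counts = [0] * n_bins
    for f in voiced:
        semitones = 12.0 * math.log2(f / ref_hz)
        idx = math.floor(semitones)
        if 0 <= idx < n_bins:  # values outside the binned range are ignored
            counts[idx] += 1

    # Mean and population standard deviation are computed in Hz here.
    return [mean(voiced), pstdev(voiced)] + counts


if __name__ == "__main__":
    track = [110.0, 110.0, 82.5, 0.0]  # made-up f0 track, one value per frame
    print(f0_vector(track))
```

The resulting list has `n_bins + 2` entries, so vectors from different speakers can be compared element by element, provided the same reference frequency and bin count are used for all of them.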
