Speaker identification by long-term spectra under normal, stress, and disguise conditions

H Hollien, P Hollien, W Majewski

doi:10.1121/1.1919594

Abstract

The authors have developed a method of speaker identification based on long-term statistical measures of speech. That is, n-dimensional Euclidian distances between long-term speech spectra are calculated and used in the identification of speakers. In this experiment, the technique was evaluated to determine whether it was resistant to the effects of variations in the mode of speech production and signal transmission. Three different speaking conditions were studied: (a) normal speech, (b) speech under stress, in which talkers were subjected to randomly distributed electric shocks while speaking, and (c) disguised speech, in which talkers were permitted to disguise their speech in any manner they chose except by whispering or the use of a foreign dialect. Moreover, the entire procedure was replicated for a restricted passband of 300–3500 Hz; this band is similar to that found in telephone transmissions. Speech samples were obtained from 25 adult American males who read Stevenson's “Apology for Idlers” under the three different experimental conditions. A portion of the tape-recorded reading was analyzed in 1/3-octave bands by means of a GR-1925 Multifilter and a GR-1926 Multichannel rms Detector; four speech samples of 32-sec duration were analyzed for each subject and experimental condition. Further processing of the spectral results (expressed in decibel levels for each frequency band) was carried out on an IBM 370/175 computer. The normalized data were used to obtain Euclidian distances; in turn, these distances were used to evaluate both intra- and interspeaker variations in the speech spectra for the different speaking and (parallel) passband conditions. The normalized mean values of the four subsamples produced by each speaker under all conditions constituted the set of reference samples, and two different sets of test samples were studied.
The reference data were used to discriminate among the speakers in the normal speaking mode; they were then used in an attempt to identify the speakers in the stress and disguise conditions. The entire process was replicated for the passband condition. The results of the research relative to the normal mode procedure demonstrated a relatively high level of correct speaker identification (slightly over 90%); the correct identification level was reduced by nearly 20% for the passband mode. The identification levels for the stress condition were nearly as high as they were for the normal mode, but the correct speaker identification level for disguise was little better than chance. Replication of the procedures for the passband condition resulted in a slight further degradation of the stress condition but a marked improvement with respect to disguise. It appears that this method can identify individuals from their speech reasonably well when they are speaking normally or under stress; it cannot do so, however, when they attempt to disguise their voices.
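
The distance-based matching described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not say how the band levels were normalized, so subtracting the mean level is an assumption made here, and the function names and the toy four-band values are hypothetical (the study used 1/3-octave band levels in decibels).

```python
import math


def normalize(levels_db):
    """Subtract the mean band level (an assumed normalization, not stated in the abstract)."""
    mean = sum(levels_db) / len(levels_db)
    return [x - mean for x in levels_db]


def mean_spectrum(subsamples):
    """Average several band-level vectors, band by band, into one reference vector."""
    n = len(subsamples)
    return [sum(band) / n for band in zip(*subsamples)]


def euclidean_distance(a, b):
    """n-dimensional Euclidean distance between two band-level vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(test_vector, references):
    """Return the label of the reference spectrum closest to the test spectrum."""
    return min(references, key=lambda name: euclidean_distance(test_vector, references[name]))


# Toy data: made-up band levels in dB for two hypothetical speakers.
references = {
    "A": mean_spectrum([normalize([60, 55, 50, 45]), normalize([62, 57, 52, 47])]),
    "B": mean_spectrum([normalize([50, 55, 60, 45])]),
}
test_sample = normalize([70, 65, 61, 55])
best_match = identify(test_sample, references)
```

With these toy values the test sample is louder overall than either reference, but after mean subtraction its spectral shape is closest to speaker "A", which is the label `identify` returns.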

Journal: The Journal of the Acoustical Society of America
Publication Date: Apr 1, 1974
Citations: 8

Similar Papers

Speaker recognition under stressed condition
G Senthil Raja ... S Dandapat
International Journal of Speech Technology | VOL. 13
03 Jun 2010

Perceptual identification of voices under normal, stress and disguise speaking conditions
Harry Hollien ... Wojciech Majewski
Journal of Phonetics | VOL. 10
01 Apr 1982

Speaker identification utilizing selected temporal speech features
Charles C Johnson ... James W Hicks
Journal of Phonetics | VOL. 12
01 Oct 1984

Intra- and inter-speaker variations of formant pattern for lateral syllables in Standard Chinese
Cuiling Zhang ... Jingxu Cui
Forensic Science International | VOL. 158
20 Jul 2005
