Abstract

Two experiments were carried out in which long-term spectra (LTS) were extracted from controlled speech samples in order to study the effectiveness of that technique as a cue for speaker identification. In the first study, power spectra were computed separately for groups of 50 American and 50 Polish male speakers under fullband and passband conditions; an n-dimensional Euclidean distance technique was used to permit identifications. The procedure resulted in high levels of speaker identification for these large groups, especially under the fullband conditions. In the second experiment, the same approach was employed to discover whether it was resistant to the effects of variation in speech production, at least under laboratory conditions. Talkers were 25 adult American males; three different speaker conditions were studied: (a) normal speech, (b) speech during stress, and (c) disguised speech. The results demonstrated high levels of correct speaker identification for normal speech, slightly reduced scores for speech during stress, and markedly reduced correct identifications for disguised speech. It would appear that long-term speech spectra can be utilized to identify individuals from their speech, even in relatively large groups, when they are speaking normally or under stress (of the type studied); LTS does not appear to be an effective technique when voice disguise is employed. While this approach was utilized only in controlled laboratory experiments, it is suggested that it may have some merit for use in applied situations or as one of the features in a multiple-vector approach.
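The general idea of comparing long-term spectra by Euclidean distance can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give the analysis parameters, so the frame length, the non-overlapping framing, the naive DFT, and the synthetic sine-wave "speakers" below are all assumptions chosen for illustration, as are the function names `long_term_spectrum`, `euclidean_distance`, `identify`, and `tone`.

```python
import cmath
import math


def long_term_spectrum(samples, frame_len=64):
    """Average the power spectrum over consecutive, non-overlapping frames.

    frame_len is an illustrative assumption, not a value from the study.
    """
    n_bins = frame_len // 2 + 1
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    totals = [0.0] * n_bins
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        for k in range(n_bins):
            # Naive DFT coefficient for bin k (fine for a toy example).
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            totals[k] += abs(coeff) ** 2
    return [t / n_frames for t in totals]


def euclidean_distance(a, b):
    """n-dimensional Euclidean distance between two spectral vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(unknown, references):
    """Return the label of the reference spectrum nearest to `unknown`."""
    return min(
        references,
        key=lambda name: euclidean_distance(unknown, references[name]),
    )


def tone(freq_bin, n_samples, frame_len=64, phase=0.0):
    """Synthetic stand-in for a talker: a sine centred on one DFT bin."""
    return [
        math.sin(2 * math.pi * freq_bin * n / frame_len + phase)
        for n in range(n_samples)
    ]


# Hypothetical reference "speakers" with different spectral peaks.
references = {
    "speaker_a": long_term_spectrum(tone(4, 256)),
    "speaker_b": long_term_spectrum(tone(9, 256)),
}

# A new sample from the first source, with a different starting phase.
unknown = long_term_spectrum(tone(4, 256, phase=0.3))
result = identify(unknown, references)
```

In this toy setup `result` is `"speaker_a"`, because the averaged power spectrum ignores the phase shift. Real speech would call for windowing, level normalization, and a far longer sample, none of which are specified in the abstract.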
