Automatic Speaker Recognition by Speech Signal

Milan Sigmund

doi:10.5772/6333

Abstract

Acoustical communication is one of the fundamental prerequisites for the existence of human society. Textual language has become extremely important in modern life, but speech has dimensions of richness that text cannot approximate. From speech alone, fairly accurate guesses can be made as to whether the speaker is male or female, adult or child. In addition, experts can extract from speech information about, for example, the speaker's state of mind. As computer power increased and knowledge about speech signals improved, research in speech processing turned toward automated systems for many purposes. Speaker recognition is the complement of speech recognition. Both techniques use similar methods of speech signal processing. In automatic speech recognition, the processing tries to extract linguistic information from the speech signal to the exclusion of personal information. Conversely, speaker recognition focuses on the characteristics unique to the individual, disregarding the words being spoken. The uniqueness of an individual's voice is a consequence of both the physical features of the person's vocal tract and the person's mental ability to control the muscles in the vocal tract. An ideal speaker recognition system would use only physical features to characterize speakers, since these features cannot be easily changed. However, physical features such as the vocal tract dimensions of an unknown speaker cannot be measured directly. Thus, numerical values for physical features would have to be derived from digital signal processing parameters extracted from the speech signal. Suppose that vocal tracts could be effectively represented by 10 independent physical features, with each feature taking on one of 10 discrete values. In this case, 10^10 individuals (i.e., 10 billion) could be distinguished, whereas today's world population amounts to approximately 7 billion individuals.
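The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `distinguishable_speakers` and the variable names are assumptions introduced here, and the population figure is the approximate value quoted in the text, not a measured constant.

```python
def distinguishable_speakers(num_features: int, levels_per_feature: int) -> int:
    """Number of distinct feature combinations when every feature is
    independent and takes one of `levels_per_feature` discrete values."""
    return levels_per_feature ** num_features


# Values taken from the text: 10 independent features, 10 levels each.
capacity = distinguishable_speakers(num_features=10, levels_per_feature=10)

# Approximate world population as quoted in the text (an assumption here).
world_population = 7_000_000_000

print(f"Distinguishable speakers: {capacity:,}")
print(f"Covers the quoted population: {capacity >= world_population}")
```

The result depends on the features being truly independent and perfectly measurable; correlated or noisy features would reduce the number of speakers that can be separated in practice.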
People can reliably identify familiar voices. About 2-3 seconds of speech is sufficient to identify a voice, although performance decreases for unfamiliar voices. One review of human speaker recognition (Lancker et al., 1985) notes that many studies of 8-10 speakers (work colleagues) yield accuracy in excess of 97% if a sentence or more of the test speech is heard. Performance falls to about 54% when the speech is shorter than 1 second and/or distorted, e.g., severely highpass or lowpass filtered. Performance also falls significantly if training and test utterances are processed through different transmission systems. A study

Publication Date: Oct 1, 2008
Citations: 7	License type: cc-by-nc-sa
