Abstract
The modern era in speaker recognition started about 50 years ago at Bell Laboratories with the controversial invention of the voiceprint technique for speaker identification, based on expert analysis of speech spectrograms. Early speaker recognition research concentrated on finding acoustic-phonetic features effective in discriminating speakers. The first truly automatic text-dependent speaker verification systems were based on time contours, or templates, of speaker-specific acoustic features. An important element of these systems was the ability to time-warp sample templates against model templates in order to provide useful comparisons. Most modern text-dependent speaker verification systems are based on statistical representations of acoustic features analyzed as a function of time over specified utterances, most notably the hidden Markov model (HMM) representation. Modern text-independent systems are based on vector quantization representations and, more recently, on Gaussian mixture model (GMM) representations. An important ingredient of statistically based systems is the use of likelihood-ratio decision techniques that rely on speaker background models. Some recent research has shown how to extract higher-level features based on speaking behavior and combine them with lower-level acoustic features for improved performance. The talk will present these topics in historical order, showing the evolution of techniques.