Fifty years of progress in speech and speaker recognition

Sadaoki Furui

doi:10.1121/1.4784967

Abstract

Speech and speaker recognition technology has made very significant progress in the past 50 years. The progress can be summarized by the following changes: (1) from template matching to corpus-based statistical modeling, e.g., HMM and n-grams, (2) from filter bank/spectral resonance to cepstral features (cepstrum + Δcepstrum + ΔΔcepstrum), (3) from heuristic time-normalization to DTW/DP matching, (4) from "distance"-based to likelihood-based methods, (5) from maximum likelihood to discriminative approaches, e.g., MCE/GPD and MMI, (6) from isolated word to continuous speech recognition, (7) from small vocabulary to large vocabulary recognition, (8) from context-independent units to context-dependent units for recognition, (9) from clean speech to noisy/telephone speech recognition, (10) from single speaker to speaker-independent/adaptive recognition, (11) from monologue to dialogue/conversation recognition, (12) from read speech to spontaneous speech recognition, (13) from recognition to understanding, (14) from single-modality (audio signal only) to multi-modal (audio/visual) speech recognition, (15) from hardware recognizer to software recognizer, and (16) from no commercial application to many practical commercial applications. Most of these advances have taken place in both speech recognition and speaker recognition. Most of the technological changes, along with many other important techniques not noted above, have been directed toward increasing the robustness of recognition.


Journal: The Journal of the Acoustical Society of America
Publication Date: Oct 1, 2004
Citations: 135

