Abstract

A hot topic in speech recognition is the development of technology for the automatic transcription of telephone conversations. The recognizer must contain robust language, pronunciation, and acoustic models that embody the world and topic knowledge, and the understanding of syntax and pronunciation, that the talkers share and use in decoding each other's acoustic signals. Partly because of this shared knowledge and the casual, unprepared nature of the speech, the signals contain dysfluencies, incomplete and ungrammatical expressions, and "lazy," reduced articulation of words. Conversational speech recognition error rates, measured in the NIST Hub-5 evaluations, are 45% for English and 66% to 75% for Spanish, Mandarin, and Arabic. To improve this performance, the shared knowledge must be represented in a mathematical framework that facilitates efficient search over the sentences of a language to decode the speech. Recent work, including workshops at Rutgers CAIP and Johns Hopkins CLSP, has investigated, among other techniques, multistream processing, frequency warping, adaptation of pronunciation and acoustic models of phones, pronunciation modeling, syllable-based recognition, dysfluency and discourse-state language models, and link grammar parsing. This talk will review how knowledge is represented in the recognizer architecture, the search procedures used, and the results of the various investigations.
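As a sketch of the mathematical framework referred to here, most large-vocabulary recognizers cast decoding as a Bayes decision rule (the symbols W for a candidate word sequence and A for the observed acoustics are introduced only for illustration; they do not appear in the abstract itself):

\[
\hat{W} \;=\; \arg\max_{W} \, P(W \mid A) \;=\; \arg\max_{W} \, P(A \mid W)\, P(W),
\]

where the acoustic model supplies P(A | W), the language model supplies P(W), and the pronunciation model links words to phone sequences within P(A | W). The search procedures reviewed in the talk can be read as efficient ways of carrying out this maximization over the sentences of the language.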
