Graceful degradation of speech recognition performance over packet-erasure networks

C. Boulis, S. Otterson, E. A. Riskin, M. Ostendorf

doi:10.1109/tsa.2002.804532

Abstract

This paper explores packet loss recovery for automatic speech recognition (ASR) in spoken dialog systems, assuming an architecture in which a lightweight client communicates with a remote ASR server. Speech is transmitted with source and channel codes optimized for the ASR application, i.e., to minimize word error rate. Unequal amounts of forward error correction are assigned to protect against packet loss, depending on the data's effect on ASR performance. Experiments with simulated packet loss in a range of loss conditions are conducted on the DARPA Communicator (air travel information) task. Results show that the approach provides robust ASR performance that degrades gracefully as packet loss rates increase. Transmitting at 5.2 kbps with up to 200 ms of added delay leads to only a 7% relative degradation in word error rate, even under extremely adverse network conditions.
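
To make the idea of unequal forward error correction concrete, here is a minimal sketch of one generic way to spread a fixed redundancy budget across data streams of differing importance. It is not the paper's scheme: the i.i.d. packet-loss model, the ideal (n, k) erasure code, the greedy allocation rule, and all names (`recovery_prob`, `allocate_parity`, `weights`, `budget`, `loss`) are assumptions made for illustration.

```python
from math import comb


def recovery_prob(n: int, k: int, loss: float) -> float:
    """Probability that at least k of n packets arrive under i.i.d. loss.

    Assumes an ideal erasure code: any k received packets recover the block.
    """
    return sum(
        comb(n, r) * (1 - loss) ** r * loss ** (n - r)
        for r in range(k, n + 1)
    )


def allocate_parity(weights, k, budget, loss):
    """Greedy unequal error protection (illustrative only).

    weights: hypothetical importance of each stream to recognition accuracy.
    k:       data packets per block in every stream.
    budget:  total number of parity packets to distribute.
    Each parity packet goes to the stream whose weighted recovery
    probability improves the most.
    """
    parity = [0] * len(weights)
    for _ in range(budget):
        gains = [
            w * (recovery_prob(k + p + 1, k, loss) - recovery_prob(k + p, k, loss))
            for w, p in zip(weights, parity)
        ]
        best = max(range(len(weights)), key=gains.__getitem__)
        parity[best] += 1
    return parity


if __name__ == "__main__":
    # Stream 0 is assumed to matter four times as much as stream 1.
    print(allocate_parity([4.0, 1.0], k=2, budget=3, loss=0.2))
```

In this toy setting the more important stream receives more parity packets, which is the qualitative behaviour the abstract describes. How the paper actually measures each stream's effect on word error rate is not stated in the abstract.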

Journal: IEEE Transactions on Speech and Audio Processing
Publication Date: Nov 1, 2002
Citations: 84

Similar Papers

Speaker verification based processing for robust ASR in co-channel speech scenarios
Seyed Omid Sadjadi ... Larry P Heck
01 Jan 2014

Analytic assessment of telephone transmission impact on ASR performance using a simulation model
Sebastian Möller ... Hervé Bourlard
Speech Communication | VOL. 38
08 Mar 2002

Using HIPAA (Health Insurance Portability and Accountability Act)-Compliant Transcription Services for Virtual Psychiatric Interviews: Pilot Comparison Study.
Salman Seyedi ... Zifan Jiang
JMIR Mental Health | VOL. 10
31 Oct 2023

"Mm-hm," "Uh-uh": are non-lexical conversational sounds deal breakers for the ambient clinical documentation technology?
Brian D Tran ... Jennifer Elston Lafata
Journal of the American Medical Informatics Association | VOL. 30
23 Jan 2023
