A robust scheme for distributed speech recognition over loss-prone packet channels

Angel M Gómez, Antonio M Peinado, Victoria Sánchez, Jose L Carmona

doi:10.1016/j.specom.2008.12.002

Open Access

https://doi.org/10.1016/j.specom.2008.12.002

Journal: Speech Communication	Publication Date: Dec 24, 2008
Citations: 14	License type: other-oa

Affiliation: University of Granada

Abstract

In this paper, we propose a complete recovery scheme designed to improve robustness against packet losses in distributed speech recognition (DSR) systems. The scheme integrates two sender-driven techniques, media-specific forward error correction (FEC) and frame interleaving, with a receiver-based error concealment (EC) technique, the weighted Viterbi algorithm (WVA). Although these techniques have already been tested separately, where they provided a significant increase in performance in clean acoustic environments, in this paper they are applied jointly and their performance is evaluated in adverse acoustic conditions. In particular, a noisy speech database and the ETSI Advanced Front-end are used, and the dynamic features, which play an important role in adverse acoustic environments, are examined together with their confidences for the WVA. Directly composing the two sender-driven techniques, each of which introduces a delay, increases the overall latency; to solve this, we propose a double-stream scheme that limits the latency to the larger of the two techniques' delays. As a result, with very few overhead bits and a very limited delay, the integrated scheme significantly improves the performance of a DSR system over a degraded transmission channel in both clean and noisy acoustic conditions.
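To illustrate why frame interleaving helps against bursty packet loss, the following is a minimal sketch of a generic block interleaver. It is not necessarily the interleaver used in the paper: the function names, the 4 x 4 block size, and the treatment of each feature frame as an independent unit of loss are all assumptions made for illustration. The sender reorders frames before transmission, so that a burst of consecutive losses on the channel becomes a set of isolated losses after the receiver restores the original order, which is an easier case for concealment.

```python
def block_interleave(frames, rows, cols):
    """Write frames row by row into a rows x cols grid, read them column by column."""
    if len(frames) != rows * cols:
        raise ValueError("frames must fill the grid exactly")
    return [frames[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(frames, rows, cols):
    """Invert block_interleave, restoring the original frame order."""
    if len(frames) != rows * cols:
        raise ValueError("frames must fill the grid exactly")
    restored = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            restored[r * cols + c] = frames[i]
            i += 1
    return restored


def longest_gap(frames):
    """Length of the longest run of consecutive lost frames (marked as None)."""
    longest = current = 0
    for f in frames:
        current = current + 1 if f is None else 0
        longest = max(longest, current)
    return longest


# Hypothetical example: 16 frame indices and a 4 x 4 block (assumed sizes).
frames = list(range(16))
sent = block_interleave(frames, 4, 4)

# Simulate a channel burst that drops 4 consecutive transmitted frames.
received = list(sent)
for i in range(4, 8):
    received[i] = None

restored = block_deinterleave(received, 4, 4)
print(longest_gap(received))  # burst length on the channel
print(longest_gap(restored))  # longest gap after de-interleaving
```

In this sketch the burst of four lost frames on the channel is spread into four isolated single-frame gaps in the restored sequence. The cost is buffering delay, since a whole block must be collected before reordering, which is the kind of latency the double-stream scheme in the abstract is meant to bound.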
