The Multi-Domain International Search on Speech 2020 ALBAYZIN Evaluation: Overview, Systems, Results, Discussion and Post-Evaluation Analyses

Javier Tejedor, Jose Ramirez, Ana Montalvo, Juan Alvarez-Trejos, Doroteo Toledano

doi:10.3390/app11188519

Abstract

The large amount of information stored in audio and video repositories makes search on speech (SoS) a challenging area that continues to attract considerable interest. Within SoS, spoken term detection (STD) aims to retrieve speech data given a text-based representation of a search query (which can include one or more words). In contrast, query-by-example spoken term detection (QbE STD) aims to retrieve speech data given an acoustic representation of a search query. This is the first paper that presents an internationally open multi-domain evaluation for SoS in Spanish that includes both STD and QbE STD tasks. The evaluation was carefully designed so that several post-evaluation analyses of the main results could be carried out. The evaluation tasks aim to retrieve the speech files that contain the queries, providing their start and end times and a score that reflects how likely it is that the query occurs within the given time interval and speech file. Three different speech databases in Spanish that comprise different domains were employed in the evaluation: the MAVIR database, which comprises a set of talks from workshops; the RTVE database, which includes broadcast news programs; and the SPARL20 database, which contains Spanish parliament sessions. We present the evaluation itself, the three databases, the evaluation metric, the systems submitted to the evaluation, the evaluation results and some detailed post-evaluation analyses based on specific query properties (in-vocabulary/out-of-vocabulary queries, single-word/multi-word queries and native/foreign queries). The most novel features of the submitted systems are a data augmentation technique for the STD task and an end-to-end system for the QbE STD task. The obtained results suggest that there is clearly room for improvement in the SoS task and that performance is highly sensitive to changes in the data domain.
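The output described in the abstract (a speech file, start and end times, and a score per detection) can be pictured with a small sketch. Everything here is an illustrative assumption rather than the evaluation's actual file format or scoring procedure: the `Detection` fields, the file identifiers, the sample query, the threshold value and the simple time-overlap test are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One hypothesised occurrence of a query (hypothetical record layout)."""
    file_id: str   # speech file in which the query was detected
    query: str     # the search term
    start: float   # start time in seconds
    end: float     # end time in seconds
    score: float   # higher means the system is more confident


def overlaps(det, ref_file, ref_start, ref_end):
    """True if the detection shares some time span with a reference occurrence."""
    return det.file_id == ref_file and det.start < ref_end and ref_start < det.end


def keep_confident(detections, threshold):
    """Keep detections scoring at or above the threshold, best first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up detections for one query across three hypothetical files.
detections = [
    Detection("rtve_07", "economía", 12.40, 12.95, 0.62),
    Detection("mavir_01", "economía", 301.10, 301.70, 0.91),
    Detection("sparl20_03", "economía", 45.00, 45.50, 0.18),
]

kept = keep_confident(detections, threshold=0.5)
hit = overlaps(detections[0], "rtve_07", 12.30, 12.90)
miss = overlaps(detections[0], "rtve_07", 13.00, 13.50)
```

In this sketch, raising the threshold trades missed occurrences for fewer false alarms, which is why a per-detection score is useful in addition to the time stamps.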

Highlights

The huge amount of information stored in audio and audiovisual repositories makes it necessary to develop efficient methods for search on speech (SoS).
Three databases that comprise different acoustic conditions and domains were employed for the evaluation: the workshop talks MAVIR and broadcast news RTVE databases, which were used in previous ALBAYZIN SoS evaluations, and the SPARL20 database, which was the new one added for this evaluation and which contains speech from Spanish parliament sessions held from 2016
We present the results obtained by the systems submitted to the evaluation for both the spoken term detection (STD) and the query-by-example spoken term detection (QbE STD) tasks, on both the development and test data.

Summary

Introduction

The huge amount of information stored in audio and audiovisual repositories makes it necessary to develop efficient methods for search on speech (SoS). Significant research has been carried out for years in this area, and, in particular, in the tasks of spoken document retrieval (SDR) [1,2,3,4,5,6], keyword spotting (KWS) [7,8,9,10,11,12,13], spoken term detection (STD) [14,15,16,17,18,19,20,21,22,23,24,25] and query-by-example spoken term detection (QbE STD) [26,27,28,29,30,31].

Spoken Term Detection Overview

Query-by-Example Spoken Term Detection Overview

Spoken Term Detection

Query-by-Example Spoken Term Detection

Evaluation Summary

Databases

SPARL20

Query List Selection

Evaluation Metrics

Comparison with Previous Search on Speech International Evaluations

Comparison with Previous STD International Evaluations

Evaluation

Comparison with Previous QbE STD International Evaluations

Comparison with Previous Search on Speech ALBAYZIN Evaluations

Systems

Results and Discussion

Development Data

Test Data

System Analysis for In-Language and Out-of-Language Queries

System Analysis for Single and Multi-Word Queries

System Analysis for In-Vocabulary and Out-of-Vocabulary Queries

Conclusions


Journal: Applied Sciences	Publication Date: Sep 14, 2021
Citations: 2	License type: CC BY 4.0


