Two‐stage spoken term detection system for under‐resourced languages

Deekshitha G, Leena Mary

doi:10.1049/iet-spr.2019.0131

Abstract

Spoken term detection (STD) is the process of locating occurrences of a spoken query in a given speech database. Two approaches are generally adopted for STD: ASR-based sequence matching and ASR-free, feature-based template matching. If a well-performing automatic speech recognition (ASR) system is available, the former is accurate. However, building an ASR system with consistent performance requires several hours of labelled speech corpora. Template matching methods work well for short or chopped utterances. In practice, however, the search database can be huge and contain sentences of varying lengths. The time complexity of template matching is therefore high, which makes it impractical for realistic search applications. In this work, a two-stage STD system is proposed that combines ASR-based phoneme sequence matching in the first stage with feature-sequence template matching at selected locations in the second stage. The time complexity of the second stage is reduced by performing dynamic time warping (DTW)-based template matching only at the probable query locations identified by the first stage. A ‘split and match’ approach helps to reduce false positives for longer query words. The effectiveness of the proposed method is demonstrated using English and Malayalam datasets.
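The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the edit-distance criterion in stage one, the thresholds `max_dist` and `dtw_thresh`, and the fixed `frames_per_ph` mapping from phoneme index to feature frames are all assumptions made for illustration (a real system would presumably take time boundaries from the ASR output). The ‘split and match’ step is not shown.

```python
from math import inf


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (pa != pb)))  # substitution or match
        prev = cur
    return prev[-1]


def stage1_candidates(query_ph, utt_ph, max_dist=1):
    """Stage 1 (assumed criterion): slide a query-length window over the
    decoded phonemes and keep start indices within max_dist edits."""
    n = len(query_ph)
    return [s for s in range(len(utt_ph) - n + 1)
            if edit_distance(query_ph, utt_ph[s:s + n]) <= max_dist]


def dtw_cost(x, y):
    """Length-normalised DTW cost between two sequences of feature vectors."""
    def dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5

    acc = [[inf] * (len(y) + 1) for _ in range(len(x) + 1)]
    acc[0][0] = 0.0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            acc[i][j] = dist(x[i - 1], y[j - 1]) + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[-1][-1] / (len(x) + len(y))


def two_stage_std(query_ph, query_feat, utt_ph, utt_feat,
                  frames_per_ph, max_dist=1, dtw_thresh=0.25):
    """Run DTW only at the candidate locations proposed by stage 1."""
    hits = []
    for s in stage1_candidates(query_ph, utt_ph, max_dist):
        # Hypothetical simplification: every phoneme spans frames_per_ph frames.
        lo = s * frames_per_ph
        hi = (s + len(query_ph)) * frames_per_ph
        cost = dtw_cost(query_feat, utt_feat[lo:hi])
        if cost <= dtw_thresh:
            hits.append((s, cost))
    return hits


if __name__ == "__main__":
    # Toy data: one-dimensional "features", two frames per phoneme.
    feat = {"k": [0.0], "ae": [1.0], "t": [2.0], "s": [3.0]}
    utt_ph = ["k", "ae", "t", "s", "ae", "t"]
    utt_feat = [feat[p] for p in utt_ph for _ in range(2)]
    query_ph = ["k", "ae", "t"]
    query_feat = [feat[p] for p in query_ph for _ in range(2)]
    print(stage1_candidates(query_ph, utt_ph))
    print(two_stage_std(query_ph, query_feat, utt_ph, utt_feat, frames_per_ph=2))
```

In this sketch the quadratic-cost DTW runs only on the few windows that survive the cheap phoneme-level filter, which is the source of the time saving the abstract describes; the second stage then rejects candidates whose acoustic match is poor.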

Journal: IET Signal Processing
Publication Date: Oct 6, 2020
Citations: 1

Similar Papers

Robust Query-by-example Spoken Term Detection for Unknown Words Using Speech Retrieval-oriented E2E ASR Modeling
Takumi Kurokawa ... Atsuhiko Kai
12 Oct 2021

Sparse Subspace Modeling for Query by Example Spoken Term Detection
Dhananjay Ram ... Herve Bourlard
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 26
01 Jun 2018

Spoken term detection from noisy input
Gabor Gosztolya ... Laszlo Toth
01 May 2011

Experimental studies on effect of speaking mode on spoken term detection
Kallola Rout ... Pappagari Raghavendra Reddy
01 Feb 2015
