Abstract

Speech keyword search (KWS) is the task of automatically detecting specified keywords in continuous speech; single-keyword detection corresponds to the keyword wake-up task. For many practical applications of such small-vocabulary recognition tasks, building a full large-vocabulary speech recognition system is costly and unnecessary. The main challenge for keyword search remains the scarcity of data resources. Speech pre-training has become an effective technique, showing its superiority across a variety of tasks: the key idea is to learn effective representations from large amounts of unlabeled data so as to improve performance when labeled data for downstream tasks are limited. This research combines unsupervised pre-training with keyword search based on the Keyword-Filler model, introducing unsupervised pre-training into speech keyword search. The pre-trained architecture selected is Wav2vec2.0, including its multilingual variant XLSR. The results show that training on features extracted by the pre-trained model outperforms the baseline. Under low-resource conditions the baseline performance drops significantly, whereas the performance of the fine-tuned pre-trained model does not decrease and even increases slightly in some intervals. This indicates that the pre-trained model can be fine-tuned to achieve good performance on very little data, and it demonstrates the advantage and practical value of keyword search based on unsupervised pre-training.
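The Keyword-Filler model mentioned in the abstract detects a keyword by comparing the best path through a left-to-right keyword model (optionally surrounded by filler) against an all-filler path, thresholding the resulting log-likelihood ratio. The sketch below is an illustrative toy implementation of that scoring idea, not the paper's actual system: the two-state keyword model, the per-frame log-likelihoods, and the function name are all assumptions made for demonstration.

```python
def keyword_filler_score(kw_loglik, filler_loglik):
    """Log-likelihood-ratio detection score for the Keyword-Filler idea (toy sketch).

    kw_loglik[s][t]     : log-likelihood of keyword state s at frame t
                          (left-to-right keyword model with self-loops)
    filler_loglik[t]    : log-likelihood of the filler/background state at frame t
    Returns the Viterbi score of the best "filler* keyword filler*" path
    minus the all-filler path score; a large value suggests the keyword is present.
    """
    T = len(filler_loglik)
    S = len(kw_loglik)
    NEG = float("-inf")
    # dp_kw[s][t]: best score of a path ending in keyword state s at frame t
    dp_kw = [[NEG] * T for _ in range(S)]
    pre = 0.0                      # score of staying in filler for frames 0..t-1
    all_filler = sum(filler_loglik)
    best_ratio = NEG
    for t in range(T):
        for s in range(S):
            stay = dp_kw[s][t - 1] if t > 0 else NEG          # self-loop
            if s == 0:
                enter = pre                                    # enter keyword from leading filler
            else:
                enter = dp_kw[s - 1][t - 1] if t > 0 else NEG  # advance to next keyword state
            dp_kw[s][t] = max(stay, enter) + kw_loglik[s][t]
        # keyword ends at frame t; remaining frames are trailing filler
        tail = sum(filler_loglik[t + 1:])
        best_ratio = max(best_ratio, dp_kw[S - 1][t] + tail - all_filler)
        pre += filler_loglik[t]
    return best_ratio

# Toy example: 4 frames, a 2-state keyword clearly fires at frames 1-2.
filler = [-1.0, -5.0, -5.0, -1.0]
keyword = [[-9.0, -1.0, -9.0, -9.0],   # keyword state 0
           [-9.0, -9.0, -1.0, -9.0]]   # keyword state 1
score = keyword_filler_score(keyword, filler)
```

In a real system the per-frame log-likelihoods would come from an acoustic model (in this research, one trained on features extracted by Wav2vec2.0), and the ratio would be compared against a tuned detection threshold.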
