Efficient data selection for ASR

Neil Taylor Kleynhans, Etienne Barnard

doi:10.1007/s10579-014-9285-0

Abstract

Automatic speech recognition (ASR) technology has matured over the past few decades and has made a significant impact in a variety of fields, from assistive technologies to commercial products. However, ASR system development is resource-intensive: it requires language resources in the form of text-annotated audio recordings and pronunciation dictionaries. Many languages spoken in the developing world are resource-scarce, and this scarcity severely inhibits the deployment of ASR systems there. One way to assist resource-scarce ASR system development is to select “useful” training samples, which could reduce the resources needed to collect new corpora. In this work, we propose a new data selection framework that can be used to design a speech recognition corpus. We show that for limited data sets, independent of language and bandwidth, the most effective data selection strategy is frequency-matched selection, and that the widely used maximum entropy methods generally produce the least promising results. In our model, the frequency-matched selection method corresponds to a logarithmic relationship between accuracy and corpus size; we also investigated other model relationships and found that a hyperbolic relationship (as suggested by simple asymptotic arguments in learning theory) may lead to somewhat better performance under certain conditions.
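The abstract names two selection criteria without defining them, so the following is only a toy sketch under stated assumptions, not the authors' framework. It assumes each candidate utterance is a list of unit labels (for example phones), that "frequency-matched" means greedily minimizing the KL divergence between a target unit distribution and the selected units, and that "maximum entropy" means greedily maximizing the entropy of the selected units. All function names (`distribution`, `entropy`, `kl_to_target`, `greedy_select`) and the example pool are hypothetical.

```python
import math
from collections import Counter


def distribution(counts):
    """Normalize a Counter of unit counts into a probability dict."""
    total = sum(counts.values())
    return {u: c / total for u, c in counts.items()}


def entropy(counts):
    """Shannon entropy (in nats) of the unit distribution in `counts`."""
    return -sum(p * math.log(p) for p in distribution(counts).values())


def kl_to_target(counts, target, eps=1e-9):
    """KL(target || selected); `eps` smooths units not yet selected (assumption)."""
    total = sum(counts.values())
    kl = 0.0
    for unit, p in target.items():
        q = (counts.get(unit, 0) + eps) / (total + eps * len(target))
        kl += p * math.log(p / q)
    return kl


def greedy_select(pool, budget, score):
    """Pick `budget` utterances, each time adding the one that maximizes `score`."""
    selected, counts = [], Counter()
    remaining = list(range(len(pool)))
    for _ in range(min(budget, len(pool))):
        best = max(remaining, key=lambda i: score(counts + Counter(pool[i])))
        selected.append(best)
        counts += Counter(pool[best])
        remaining.remove(best)
    return selected


# Hypothetical pool: each utterance is a list of unit labels.
pool = [
    ["a", "a", "b"],
    ["a", "b", "c"],
    ["c", "d", "d"],
    ["a", "a", "a"],
]
# Assumption: the target distribution is the unit distribution of the whole pool.
target = distribution(Counter(u for utt in pool for u in utt))

freq_matched = greedy_select(pool, 3, lambda c: -kl_to_target(c, target))
max_entropy = greedy_select(pool, 3, lambda c: entropy(c))
print(freq_matched, max_entropy)
```

On this toy pool the two criteria agree on the first two picks and diverge on the third: the entropy criterion prefers the utterance that flattens the unit distribution, while the frequency-matched criterion prefers the one that restores the pool's skew toward the frequent unit. This says nothing about recognition accuracy, which the paper evaluates empirically.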

Journal: Language Resources and Evaluation
Publication Date: Oct 14, 2014
Citations: 5

