Exploration of Whisper fine-tuning strategies for low-resource ASR

Yunpeng Liu, Xukui Yang, Dan Qu

doi:10.1186/s13636-024-00349-3

Abstract

Limited data availability remains a significant challenge for Whisper’s speech recognition in low-resource languages, where its performance falls short of practical application requirements. While previous studies have reduced recognition error rates on target-language speech through fine-tuning, a comprehensive analysis of Whisper’s fine-tuning capabilities, and of the advantages and disadvantages of different fine-tuning strategies, is still lacking. This paper aims to fill this gap through a comprehensive experimental exploration of Whisper’s low-resource speech recognition performance, using five fine-tuning strategies with limited supervised data from seven low-resource languages. The results and analysis demonstrate that all fine-tuning strategies explored in this paper significantly enhance Whisper’s performance. However, the strategies vary in their suitability and practical effectiveness, highlighting the need for careful selection based on the specific use case and the resources available.

Journal: EURASIP Journal on Audio, Speech, and Music Processing
Publication Date: Jun 1, 2024
License type: CC BY

Similar Papers

Comparative Analysis of Multilingual QA Models and Their Adaptation to the Kazakh Language
Arailym Tleubayeva ... Aday Shomanov
Scientific Journal of Astana IT University | VOL. -
30 Sep 2024

Multi-task Learning of Deep Neural Networks for Low-resource Speech Recognition
Dongpeng Chen ... Brian Mak
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 23
01 Jan 2015

The Effects of Practice Modality on Pragmatic Development in L2 Chinese
Shuai Li ... Naoko Taguchi
The Modern Language Journal | VOL. 98
01 Sep 2014

A Method Improves Speech Recognition with Contrastive Learning in Low-Resource Languages
Lixu Sun ... Lina Jiang
Applied Sciences | VOL. 13
12 Apr 2023
