Exploration of Whisper fine-tuning strategies for low-resource ASR

Authors :
Yunpeng Liu
Xukui Yang
Dan Qu
Source :
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2024, Iss 1, Pp 1-11 (2024)
Publication Year :
2024
Publisher :
SpringerOpen, 2024.

Abstract

Limited data availability remains a significant challenge for Whisper’s low-resource speech recognition, whose performance falls short of practical application requirements. While previous studies have successfully reduced recognition error rates on target-language speech through fine-tuning, a comprehensive exploration and analysis of Whisper’s fine-tuning capabilities, and of the advantages and disadvantages of the various fine-tuning strategies, is still lacking. This paper aims to fill this gap by conducting a comprehensive experimental exploration of Whisper’s low-resource speech recognition performance using five fine-tuning strategies with limited supervised data from seven low-resource languages. The results and analysis demonstrate that all fine-tuning strategies explored in this paper significantly enhance Whisper’s performance. However, the strategies differ in their suitability and practical effectiveness, highlighting the need for careful selection based on the specific use case and the resources available.

Details

Language :
English
ISSN :
16874722
Volume :
2024
Issue :
1
Database :
Directory of Open Access Journals
Journal :
EURASIP Journal on Audio, Speech, and Music Processing
Publication Type :
Academic Journal
Accession number :
edsdoj.3508b4d3810e455288ed4cc38bb1d791
Document Type :
article
Full Text :
https://doi.org/10.1186/s13636-024-00349-3