Personalizing ASR for Dysarthric and Accented Speech with Limited Data
- Source:
- INTERSPEECH
- Publication Year:
- 2019
- Publisher:
- arXiv, 2019.
Abstract
- Automatic speech recognition (ASR) systems have improved dramatically over the last few years. However, ASR systems are most often trained on 'typical' speech, which means that underrepresented groups do not experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech: speech from people with amyotrophic lateral sclerosis (ALS) and accented speech. We train personalized models that achieve 62% and 35% relative WER improvement on these two groups, bringing the absolute WER for ALS speakers, on a test set of message bank phrases, down to 10% for mild dysarthria and 20% for more serious dysarthria. We show that 71% of the improvement comes from only 5 minutes of training data. Finetuning a particular subset of layers (with many fewer parameters) often gives better results than finetuning the entire model. This is the first step towards building state-of-the-art ASR models for dysarthric speech.
- Comment: 5 pages
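The layer-subset finetuning the abstract describes can be illustrated with a short, hypothetical PyTorch sketch: freeze the pretrained model, unfreeze only a chosen subset of layers, and fit that subset on the small personalized dataset. The paper's actual model, layer choices, and training code are not given in this record; the toy layers and the `freeze_all_but` helper below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained ASR network; the layer names and sizes
# are illustrative, not the architecture used in the paper.
model = nn.ModuleDict({
    "enc0": nn.Linear(80, 256),   # lowest "encoder" layer
    "enc1": nn.Linear(256, 256),
    "dec":  nn.Linear(256, 32),   # output projection
})

def freeze_all_but(model: nn.Module, prefixes: tuple) -> list:
    """Freeze every parameter whose name does not start with one of
    `prefixes`; return the parameters left trainable."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefixes)
        if param.requires_grad:
            trainable.append(param)
    return trainable

# Finetune only the lowest layer on the few minutes of personalized
# speech; the optimizer sees just that small parameter subset.
params = freeze_all_but(model, ("enc0",))
optimizer = torch.optim.Adam(params, lr=1e-4)
```

Restricting the optimizer to a small subset of parameters is what makes finetuning feasible from as little as 5 minutes of data, per the abstract: far fewer parameters must be adapted, which reduces overfitting on the tiny personalized training set.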
- Subjects:
- FOS: Computer and information sciences
  Computer Science - Computation and Language (cs.CL)
  Computer Science - Machine Learning (cs.LG)
  Computer Science - Sound (cs.SD)
  FOS: Electrical engineering, electronic engineering, information engineering
  Electrical Engineering and Systems Science - Audio and Speech Processing (eess.AS)
  Speech recognition
  Dysarthria
Details
- Database:
- OpenAIRE
- Journal:
- INTERSPEECH
- Accession number:
- edsair.doi.dedup.....19e45d966b6918f53b1969171cded8f6
- Full Text:
- https://doi.org/10.48550/arxiv.1907.13511