On scaling contrastive representations for low-resource speech recognition

Authors :: Borgholt, Lasse
Tax, Tycho M.S.
Havtorn, Jakob D.
Maaløe, Lars
Igel, Christian
Source :: Borgholt, L, Tax, T M S, Havtorn, J D, Maaløe, L & Igel, C 2021, On scaling contrastive representations for low-resource speech recognition. in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). vol. 2021-June, IEEE, pp. 3885-3889, 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021, Virtual, Toronto, Canada, 06/06/2021.
Publication Year :: 2021
Abstract: Recent advances in self-supervised learning through contrastive training have shown that it is possible to learn a competitive speech recognition system with as little as 10 minutes of labeled data. However, these systems are computationally expensive since they require pre-training followed by fine-tuning in a large parameter space. We explore the performance of such systems without fine-tuning by training a state-of-the-art speech recognizer on the fixed representations from the computationally demanding wav2vec 2.0 framework. We find performance to decrease without fine-tuning and, in the extreme low-resource setting, wav2vec 2.0 is inferior to its predecessor. In addition, we find that wav2vec 2.0 representations live in a low-dimensional subspace and that decorrelating the features of the representations can stabilize training of the automatic speech recognizer. Finally, we propose a bidirectional extension to the original wav2vec framework that consistently improves performance.

Details

Database :: OAIster
Journal :: Borgholt, L, Tax, T M S, Havtorn, J D, Maaløe, L & Igel, C 2021, On scaling contrastive representations for low-resource speech recognition. in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). vol. 2021-June, IEEE, pp. 3885-3889, 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021, Virtual, Toronto, Canada, 06/06/2021.
Notes :: English
Publication Type :: Electronic Resource
Accession number :: edsoai.on1322768307
Document Type :: Electronic Resource