
Keyword Search Based on Unsupervised Pre-Trained Acoustic Models

Authors :
Li, Xiner
Zhao, Jing
Zhang, Wei-Qiang
Lv, Zhiqiang
Huang, Shen
Source :
International Journal of Asian Language Processing; September 2021, Vol. 31, Issue 3-4
Publication Year :
2021

Abstract

Speech keyword search (KWS) is the task of automatically detecting specified keywords in continuous speech; single-keyword detection can be regarded as a keyword wake-up task. For many practical applications of these small-vocabulary speech recognition tasks, building a full large-vocabulary speech recognition system is costly and unnecessary. For speech keyword search, insufficient data resources remain the main challenge. Speech pre-training has become an effective technique, showing its superiority in a variety of tasks. The key idea is to learn effective representations from large amounts of unlabeled data so as to improve performance when labeled data for downstream tasks are limited. This research combines unsupervised pre-training with keyword search based on the Keyword-Filler model, introducing unsupervised pre-training into speech keyword search. The pre-trained model architecture wav2vec 2.0, including its multilingual variant XLSR, is adopted. The results show that training on features extracted by the pre-trained model outperforms the baseline. Under low-resource conditions, the baseline performance drops significantly, while the performance of the fine-tuned pre-trained model does not decrease and even increases slightly in some intervals. This indicates that the pre-trained model can be fine-tuned to achieve better performance with very little data, demonstrating the advantage and practical value of keyword search based on unsupervised pre-training.
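As a concrete illustration of the feature-extraction step described in the abstract, below is a minimal sketch (not taken from the paper) of obtaining frame-level representations from a pre-trained wav2vec 2.0 / XLSR model with the Hugging Face transformers library. The checkpoint name, the extract_features helper, and the downstream keyword-filler decoder are assumptions for illustration only.

# Minimal sketch: extract wav2vec 2.0 / XLSR features that could feed a
# keyword-filler acoustic model. Checkpoint and helper names are assumed.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

MODEL_NAME = "facebook/wav2vec2-large-xlsr-53"  # assumed XLSR checkpoint

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
model = Wav2Vec2Model.from_pretrained(MODEL_NAME)
model.eval()

def extract_features(wav_path: str) -> torch.Tensor:
    """Return a (frames, hidden_dim) matrix of wav2vec 2.0 representations."""
    waveform, sample_rate = torchaudio.load(wav_path)
    # wav2vec 2.0 models expect 16 kHz mono audio.
    if sample_rate != 16000:
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)
    inputs = feature_extractor(
        waveform.squeeze(0).numpy(), sampling_rate=16000, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(inputs.input_values).last_hidden_state  # (1, T, D)
    return hidden.squeeze(0)

# These features would replace conventional acoustic features (e.g. MFCCs)
# as the input to the acoustic model used by a keyword-filler decoder.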

Details

Language :
English
ISSN :
2717-5545 and 2424-791X
Volume :
31
Issue :
3-4
Journal :
International Journal of Asian Language Processing
Publication Type :
Periodical
Full Text :
https://doi.org/10.1142/S2717554522500059