Query-by-Example with Acoustic Word Embeddings Using wav2vec Pretraining

Authors :
LI Zhao-qi, LI Ta
Source :
Jisuanji kexue, Vol 49, Iss 1, Pp 59-64 (2022)
Publication Year :
2022
Publisher :
Editorial office of Computer Science, 2022.

Abstract

Query-by-Example is a popular keyword detection method for low-resource settings. It can build a well-performing keyword query system when labeled speech data are scarce and no pronunciation dictionary is available. In recent years, neural acoustic word embeddings have become a commonly used Query-by-Example method. In this paper, we propose using wav2vec pre-training to improve a neural acoustic word embedding system based on a bidirectional long short-term memory network. On a data set extracted from Switchboard, directly replacing the Mel frequency cepstral coefficient features with features extracted by the wav2vec model yields relative increases of 11.1% in the system's average precision and 10.0% in its precision-recall break-even point. We then tried several methods of fusing the wav2vec features with the Mel frequency cepstral coefficient features to extract the embedding vector. Compared with the method using only wav2vec features, the fusion method gives further relative increases of 5.3% in average precision and 2.5% in precision-recall break-even point.
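
The retrieval step and the two metrics named in the abstract can be illustrated with a small sketch. It is not the authors' system or their evaluation protocol: it assumes that fixed-dimensional embeddings are already available, that cosine similarity is used for ranking, and that the break-even point is read from a single ranked list. All function names, vectors, and labels are hypothetical and chosen only for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_segments(query_emb, segment_embs):
    """Return segment indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_emb, s) for s in segment_embs]
    return sorted(range(len(segment_embs)), key=lambda i: scores[i], reverse=True)

def average_precision(ranked_labels):
    """Mean of precision values taken at the rank of each true match."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_labels, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def break_even_point(ranked_labels):
    """Precision at rank R, where R is the number of true matches.

    At that cutoff precision and recall are equal by construction.
    """
    num_matches = sum(ranked_labels)
    if num_matches == 0:
        return 0.0
    return sum(ranked_labels[:num_matches]) / num_matches

def relative_gain(baseline, improved):
    """Relative increase of a metric over a baseline value."""
    return (improved - baseline) / baseline

# Toy data (hypothetical): one query embedding and four candidate segments.
query = [1.0, 0.0]
segments = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3], [-1.0, 0.2]]
is_same_word = [True, False, False, True]  # made-up ground truth

order = rank_segments(query, segments)
ranked_labels = [is_same_word[i] for i in order]
ap = average_precision(ranked_labels)
bep = break_even_point(ranked_labels)
```

On this toy data the ranking is segments 0, 2, 1, 3, which gives an average precision of 0.75 and a break-even point of 0.5. The relative increases quoted in the abstract are of the form computed by `relative_gain`, for example `relative_gain(0.50, 0.55)` for a metric rising from 0.50 to 0.55 (invented values, not results from the paper).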

Details

Language :
Chinese
ISSN :
1002137X
Volume :
49
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Jisuanji kexue
Publication Type :
Academic Journal
Accession number :
edsdoj.2c25f9f9784342f29f7a81189527952e
Document Type :
article
Full Text :
https://doi.org/10.11896/jsjkx.210900007