Contextual Biasing for End-to-End Chinese ASR

Authors :
Kai Zhang
Qiuxia Zhang
Chung-Che Wang
Jyh-Shing Roger Jang
Source :
IEEE Access, Vol 12, Pp 92960-92975 (2024)
Publication Year :
2024
Publisher :
IEEE, 2024.

Abstract

End-to-end speech recognition is more robust than conventional methods and improves recognition accuracy across diverse contexts. However, because it lacks an independent language model, it struggles to recognize vocabulary absent from the training data, which degrades recognition of certain domain-specific terms. Adapting to different scenarios therefore requires biasing the model toward specific domains. Using the CATSLU dataset, this study constructs two Chinese contextual biasing tasks, one targeting proper nouns and one targeting mixed-domain sentences. It also explores four contextual biasing methods, each applied at a different stage of the speech recognition process: pre-recognition, within the model, decoding, and post-processing. Experimental results indicate that all of the biasing methods improved, to some extent, the model's recognition performance within specific domains.

Details

Language :
English
ISSN :
2169-3536
Volume :
12
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.67a44e020cbc4ecd83f24398cc443137
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2024.3424260