1. Speech recognition method based on multi-task loss with an additional language model (基于多任务损失附加语言模型的语音识别方法).
- Author
- 柳永利, 张绍阳, 王裕恒, and 解熠
- Subjects
- *LANGUAGE models, *SPEECH perception, *PROBLEM solving, *LANGUAGE attrition, *ERROR rates, *DEEP learning
- Abstract
To address the problems that the Attention mechanism's overly flexible alignment adapts poorly to complex environments and that simple end-to-end models do not fully exploit language features, a speech recognition method based on multi-task loss with an additional language model was investigated. By analyzing the characteristics of the speech signal, the features containing more information were selected for training. Based on the Attention-based Conformer end-to-end model, the model was trained with a multi-task loss in which CTC loss assists the pure Conformer (Attention) objective, yielding the Conformer-CTC speech recognition model. Building on the Conformer-CTC model, and after analyzing and comparing the characteristics and effects of several language models, a Transformer language model was added to the training of the above model through a re-scoring mechanism, yielding the Conformer-CTC-Transformer speech recognition model. Experiments on these models were carried out on the AISHELL-1 dataset. The results show that, compared with the pure Conformer (Attention) model, the character error rate (CER) of the Conformer-CTC model on the test set is reduced by 0.49%, and the CER of the Conformer-CTC-Transformer model on the test set is reduced by a further 0.79% compared with the Conformer-CTC model. CTC loss improves the adaptability of Attention alignment in complex environments, and re-scoring the Conformer-CTC model with the Transformer language model increases recognition accuracy by another 0.30%. Compared with some existing end-to-end models, the Conformer-CTC-Transformer model achieves better recognition performance, indicating that the model has a certain effectiveness. [ABSTRACT FROM AUTHOR]
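The two mechanisms described in the abstract, a weighted CTC/attention multi-task loss and language-model re-scoring of decoder hypotheses, can be sketched in a few lines. This is a minimal illustration of the general hybrid-training and re-scoring pattern; the function names, the weight values, and the score convention are assumptions for illustration, not details taken from the paper:

```python
def multi_task_loss(ctc_loss: float, attention_loss: float,
                    ctc_weight: float = 0.3) -> float:
    """Hybrid training objective: weighted sum of the CTC loss and the
    attention (cross-entropy) loss. ctc_weight is a hypothetical
    hyperparameter, not a value reported in the paper."""
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


def rescore(hypotheses, lm_score_fn, lm_weight: float = 0.5):
    """Re-score n-best decoder hypotheses with an external language model.

    hypotheses: list of (text, acoustic_log_prob) pairs; lm_score_fn maps
    text to a language-model log-probability. Higher combined scores win.
    """
    return max(hypotheses, key=lambda h: h[1] + lm_weight * lm_score_fn(h[0]))
```

In a real system the acoustic and language-model scores would come from the Conformer decoder and the Transformer LM respectively; here they are plain numbers so the combination logic is easy to see.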
- Published
- 2023