1. Speech recognition method based on multi-task loss with an additional language model (基于多任务损失附加语言模型的语音识别方法).
- Author
- 柳永利, 张绍阳, 王裕恒, and 解熠
- Subjects
- *LANGUAGE models, *SPEECH perception, *PROBLEM solving, *LANGUAGE attrition, *ERROR rates, *DEEP learning
- Abstract
To address the problems that the Attention mechanism's overly flexible alignment adapts poorly to complex environments and that simple end-to-end models do not fully exploit language features, a speech recognition method based on multi-task loss with an additional language model was investigated. By analyzing the characteristics of the speech signal, the features containing more information were selected for training. Based on the Attention-based Conformer end-to-end model, the model was trained with a multi-task loss in which CTC loss assists the pure Conformer (Attention) objective, yielding the Conformer-CTC speech recognition model. Building on the Conformer-CTC model, and after analyzing and comparing the characteristics and effects of several language models, a Transformer language model was added to the training of the above model through a re-scoring mechanism, yielding the Conformer-CTC-Transformer speech recognition model. Experiments on these models were carried out on the AISHELL-1 dataset. The results show that, compared with the pure Conformer (Attention) model, the character error rate (CER) of the Conformer-CTC model on the test set is reduced by 0.49%, and the CER of the Conformer-CTC-Transformer model on the test set is reduced by a further 0.79% compared with the Conformer-CTC model. CTC loss improves the adaptability of Attention alignment in complex environments, and re-scoring the Conformer-CTC model with the Transformer language model increases recognition accuracy by another 0.30%. Compared with some existing end-to-end models, the Conformer-CTC-Transformer model achieves better recognition performance, indicating that the model has a certain effectiveness. [ABSTRACT FROM AUTHOR]
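The two mechanisms described in the abstract, a weighted CTC/attention multi-task loss and language-model re-scoring of decoder hypotheses, can be sketched in a few lines. This is a minimal illustration of the general hybrid-training and re-scoring pattern; the function names, the weight values, and the score convention are assumptions for illustration, not details taken from the paper:

```python
def multi_task_loss(ctc_loss: float, attention_loss: float,
                    ctc_weight: float = 0.3) -> float:
    """Hybrid training objective: weighted sum of the CTC loss and the
    attention (cross-entropy) loss. ctc_weight is a hypothetical
    hyperparameter, not a value reported in the paper."""
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


def rescore(hypotheses, lm_score_fn, lm_weight: float = 0.5):
    """Re-score n-best decoder hypotheses with an external language model.

    hypotheses: list of (text, acoustic_log_prob) pairs; lm_score_fn maps
    text to a language-model log-probability. Higher combined scores win.
    """
    return max(hypotheses, key=lambda h: h[1] + lm_weight * lm_score_fn(h[0]))
```

In a real system the acoustic and language-model scores would come from the Conformer decoder and the Transformer LM respectively; here they are plain numbers so the combination logic is easy to see.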
- Published
- 2023