Cross-Lingual Speech Recognition Model Based on Feature Prompts.

Authors :: 王嘉文
 高定国
 索朗曲珍
 尼琼
Source :: Science Technology & Engineering. 2024, Vol. 24 Issue 24, p10348-10355. 8p.
Publication Year :: 2024
Abstract: Cross-lingual speech recognition uses data from multiple source languages to train systems that recognize speech in a target language, thereby promoting intercultural communication and understanding. It faces several open problems: how to use multilingual data to improve recognition performance for low-resource languages, how to handle domain shift or interference between source and target languages, and how to balance task weights and data distribution across languages. To address these problems, a cross-lingual speech recognition model based on feature prompts was studied. To simplify the traditional process, in which professionals must label phonemes uniformly, a cross-lingual speech data annotation method that identifies the corresponding language in the original data was also studied, and experiments were conducted on two public datasets. The results show that the average error rate of the proposed model is 46.44% lower than that of Conformer, a mainstream speech recognition model, and 2.1% lower than that of the baseline model, giving higher recognition accuracy. The results offer new perspectives and methods for cross-lingual speech recognition. [ABSTRACT FROM AUTHOR]