
Deep joint learning for language recognition.

Authors :
Li, Lin
Li, Zheng
Liu, Yan
Hong, Qingyang
Source :
Neural Networks. Sep 2021, Vol. 141, p72-86. 15p.
Publication Year :
2021

Abstract

Deep learning methods for language recognition have achieved promising performance. However, most studies focus on frameworks built around a single type of acoustic feature and a single task. In this paper, we propose deep joint learning strategies based on Multi-Feature (MF) and Multi-Task (MT) models. First, we investigate the efficiency of integrating multiple acoustic features and explore two kinds of training constraints: one introduces auxiliary classification constraints with adaptive weights on the loss functions of the feature encoder sub-networks, and the other introduces a Canonical Correlation Analysis (CCA) constraint that maximizes the correlation between different feature representations. Correlated speech tasks, such as phoneme recognition, are applied as auxiliary tasks to learn related information that enhances language recognition. We analyze phoneme-aware information under different learning strategies, including frame-level joint learning, segment-level adversarial learning, and their combination. In addition, we present a Language-Phoneme embedding extraction structure that learns and extracts language and phoneme embedding representations simultaneously. We demonstrate the effectiveness of the proposed approaches with experiments on the Oriental Language Recognition (OLR) data sets. Experimental results indicate that joint learning with the multi-feature and multi-task models extracts intrinsic feature representations of language identity and improves performance, especially in complex conditions such as cross-channel or open-set evaluation.
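
Since only the abstract is available here, the following is a minimal illustrative sketch, not the authors' implementation. It shows, under stated assumptions, how a multi-feature encoder with an auxiliary phoneme-classification head and a simplified CCA-style correlation penalty between the two feature branches could be wired up, roughly in the spirit of the MF/MT strategies described above. All class names, layer sizes, loss weights, and the simplified correlation loss are assumptions made for illustration.

```python
# Minimal sketch (PyTorch) of a multi-feature, multi-task model in the spirit of the
# abstract above. NOT the authors' implementation: all layer sizes, names, and the
# simplified CCA-style penalty are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiFeatureMultiTaskNet(nn.Module):
    def __init__(self, mfcc_dim=40, fbank_dim=80, embed_dim=256,
                 num_languages=10, num_phonemes=100):
        super().__init__()
        # One encoder branch per acoustic feature type (e.g. MFCC and filterbank).
        self.mfcc_encoder = nn.Sequential(nn.Linear(mfcc_dim, embed_dim), nn.ReLU())
        self.fbank_encoder = nn.Sequential(nn.Linear(fbank_dim, embed_dim), nn.ReLU())
        # Task heads: language ID (main task) and phoneme classification (auxiliary).
        self.language_head = nn.Linear(2 * embed_dim, num_languages)
        self.phoneme_head = nn.Linear(2 * embed_dim, num_phonemes)

    def forward(self, mfcc, fbank):
        h1 = self.mfcc_encoder(mfcc)       # (batch, embed_dim)
        h2 = self.fbank_encoder(fbank)     # (batch, embed_dim)
        fused = torch.cat([h1, h2], dim=-1)
        return self.language_head(fused), self.phoneme_head(fused), h1, h2


def correlation_penalty(h1, h2, eps=1e-6):
    """Simplified CCA-style constraint: encourage the two feature representations
    to be correlated by maximizing the mean per-dimension correlation (returned as
    its negative so it can be minimized). A full CCA objective would also whiten
    the representations; that step is omitted here."""
    h1 = h1 - h1.mean(dim=0, keepdim=True)
    h2 = h2 - h2.mean(dim=0, keepdim=True)
    corr = (h1 * h2).sum(dim=0) / (h1.norm(dim=0) * h2.norm(dim=0) + eps)
    return -corr.mean()


# Joint training step: main language loss plus weighted auxiliary losses.
model = MultiFeatureMultiTaskNet()
mfcc, fbank = torch.randn(8, 40), torch.randn(8, 80)
lang_labels = torch.randint(0, 10, (8,))
phone_labels = torch.randint(0, 100, (8,))

lang_logits, phone_logits, h1, h2 = model(mfcc, fbank)
loss = (F.cross_entropy(lang_logits, lang_labels)
        + 0.3 * F.cross_entropy(phone_logits, phone_labels)  # auxiliary-task weight (assumed)
        + 0.1 * correlation_penalty(h1, h2))                  # CCA-style constraint weight (assumed)
loss.backward()
```

The segment-level adversarial variant mentioned in the abstract would instead pass the fused representation through a gradient-reversal layer before the phoneme head; that detail is omitted from this sketch for brevity.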

Details

Language :
English
ISSN :
0893-6080
Volume :
141
Database :
Academic Search Index
Journal :
Neural Networks
Publication Type :
Academic Journal
Accession number :
151634270
Full Text :
https://doi.org/10.1016/j.neunet.2021.03.026