Deep neural networks for i-vector language identification of short utterances in cars

Authors :
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Ghahabi Esfahani, Omid
Bonafonte Cávez, Antonio
Hernando Pericás, Francisco Javier
Moreno Bilbao, M. Asunción
Publication Year :
2016

Abstract

This paper focuses on the application of Language Identification (LID) technology to intelligent vehicles. We address short sentences or words spoken in moving cars in four languages: English, Spanish, German, and Finnish. As the response time of the LID system is crucial for user acceptance in this task, speech signals of different durations, with an overall average of 3.8 s, are analyzed. We propose the use of Deep Neural Networks (DNNs) to model the i-vector space of languages effectively. Both raw i-vectors and session-variability-compensated i-vectors are evaluated as input vectors to the DNNs. The performance of the proposed DNN architecture is compared with both conventional GMM-UBM and i-vector/LDA systems, taking the effect of signal duration into account. It is shown that signals with durations between 2 and 3 s meet the requirements of this application, i.e., high accuracy and fast decision; in this range the proposed DNN architecture outperforms the GMM-UBM and i-vector/LDA systems by 37% and 28%, respectively.
Peer Reviewed
Postprint (published version)

Details

Database :
OAIster
Notes :
5 p., application/pdf, English
Publication Type :
Electronic Resource
Accession number :
edsoai.ocn969841512
Document Type :
Electronic Resource