Back to Search Start Over

Compensate multiple distortions for speaker recognition systems

Authors :
Mohammadamini, Mohammad
Matrouf, Driss
Bonastre, Jean-Francois
Serizel, Romain
Dowerah, Sandipana
Jouvet, Denis
Laboratoire Informatique d'Avignon (LIA)
Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI
Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Source :
EUSIPCO 2021-29th European Signal Processing Conference, EUSIPCO 2021-29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9615983⟩, EUSIPCO 2021, EUSIPCO 2021, Aug 2021, Dublin, Ireland
Publication Year :
2021
Publisher :
HAL CCSD, 2021.

Abstract

International audience; The performance of speaker recognition systems reduces dramatically in severe conditions in the presence of additive noise and/or reverberation. In some cases, there is only one kind of domain mismatch like additive noise or reverberation, but in many cases, there are more than one distortion. Finding a solution for domain adaptation in the presence of different distortions is a challenge. In this paper we investigate the situation in which there is none, one or more of the following distortions: early reverberation, full reverberation, additive noise. We propose two configurations to compensate for these distortions. In the first one a specific denoising autoencoder is used for each distortion. In the second configuration, a denoising autoencoder is used to compensate for all of these distortions simultaneously. Our experiments show that, in the coexistence of noise and reverberation, the second configuration gives better results. For example, with the second configuration we obtained 76.6% relative improvement of EER for utterances longer than 12 seconds. For other situations in the presence of only one distortion, the second configuration gives almost the same results achieved by using a specific model for each distortion.

Details

Language :
English
Database :
OpenAIRE
Journal :
EUSIPCO 2021-29th European Signal Processing Conference, EUSIPCO 2021-29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9615983⟩, EUSIPCO 2021, EUSIPCO 2021, Aug 2021, Dublin, Ireland
Accession number :
edsair.doi.dedup.....a5e61955bf2c8d91c5560c63f750f622
Full Text :
https://doi.org/10.23919/EUSIPCO54536.2021.9615983⟩