Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification
- Source :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing. 30:414-428
- Publication Year :
- 2022
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2022.
Abstract
- This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, for unsupervised PLDA estimation, and for inferring the reliability of a PLDA training set. Experimental results show that the proposed approach provides significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels and for data sets with naturally occurring missing or unreliable labels.
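The noisy-channel view of labeling described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's VB-MAP procedure. It only shows what passing true labels through a discrete memoryless channel means, assuming a symmetric channel with a single hypothetical `flip_prob` parameter. The function names `symmetric_dmc` and `transmit`, the number of classes, and the flip probability are all assumptions chosen for illustration.

```python
import random


def symmetric_dmc(num_classes, flip_prob):
    """Build a row-stochastic matrix with entry [i][j] = P(observed=j | true=i).

    Assumption: errors are spread uniformly over the other classes.
    """
    off_diagonal = flip_prob / (num_classes - 1)
    return [
        [1.0 - flip_prob if i == j else off_diagonal for j in range(num_classes)]
        for i in range(num_classes)
    ]


def transmit(true_labels, channel, rng):
    """Corrupt each label independently of all others (the memoryless property)."""
    classes = list(range(len(channel)))
    return [rng.choices(classes, weights=channel[y])[0] for y in true_labels]


rng = random.Random(0)  # fixed seed so the run is repeatable
channel = symmetric_dmc(num_classes=4, flip_prob=0.2)
true_labels = [rng.randrange(4) for _ in range(10000)]
observed = transmit(true_labels, channel, rng)

# Fraction of observed labels that differ from the latent true labels.
error_rate = sum(t != o for t, o in zip(true_labels, observed)) / len(true_labels)
print(f"empirical label error rate: {error_rate:.3f}")
```

In this toy setting the empirical error rate should land close to the chosen `flip_prob`. An estimator in the spirit of the abstract would treat `true_labels` as latent and work only from `observed`.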
- Subjects :
- Hyperparameter
Bayes estimator
Acoustics and Ultrasonics
Computer science
business.industry
Posterior probability
Pattern recognition
Context (language use)
Speaker recognition
Computational Mathematics
Bayes' theorem
ComputingMethodologies_PATTERNRECOGNITION
Computer Science (miscellaneous)
Maximum a posteriori estimation
Artificial intelligence
Electrical and Electronic Engineering
business
Random variable
Details
- ISSN :
- 2329-9304 and 2329-9290
- Volume :
- 30
- Database :
- OpenAIRE
- Journal :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- Accession number :
- edsair.doi...........7e54b8faaf98e25e6aa97a2f5392d04a