Avoiding dominance of speaker features in speech-based depression detection.
- Source :
- Pattern Recognition Letters. Sep 2023, Vol. 173, p50-56. 7p.
- Publication Year :
- 2023
Abstract
- The performance of speech-based depression detectors is limited by the scarcity and imbalance in depression data. We found that depression detectors could be strongly biased toward speaker features when the number of training speakers is insufficient. To address this issue, we propose a speaker-invariant depression detector (SIDD) that minimizes speaker information in the latent space. The SIDD consists of an autoencoder, a depression classifier, and a speaker-embedding projector. By incorporating speaker-embedding vectors into the autoencoder's latent vectors, speaker information is effectively eliminated for the depression classifier. Experimental results demonstrate significant improvements achieved by minimizing speaker information, and our proposed method generally outperforms previous approaches for depression detection on the DAIC-WOZ dataset.
• Depression models trained on small datasets will bias toward speaker features.
• Use an autoencoder to minimize speaker information in a latent space.
• Minimizing speaker information can improve depression detection performance.
• A balanced number of training vectors from different speakers is critical.
[ABSTRACT FROM AUTHOR]
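The abstract notes that a balanced number of training vectors from different speakers is critical, but this record does not say how the authors balance them. The following is a minimal sketch of one common option, assuming per-speaker random subsampling down to the smallest speaker's count. The function name `balance_by_speaker`, the `(speaker_id, feature_vector)` sample layout, and the toy data are all hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def balance_by_speaker(samples, seed=0):
    """Subsample so that every speaker contributes the same number of vectors.

    `samples` is a list of (speaker_id, feature_vector) pairs. This layout is
    an assumption made for illustration, not the paper's data format.
    """
    rng = random.Random(seed)

    # Group feature vectors by the speaker they came from.
    by_speaker = defaultdict(list)
    for speaker_id, vector in samples:
        by_speaker[speaker_id].append(vector)
    if not by_speaker:
        return []

    # Every speaker is cut down to the size of the smallest group.
    quota = min(len(vectors) for vectors in by_speaker.values())

    balanced = []
    for speaker_id in sorted(by_speaker):
        for vector in rng.sample(by_speaker[speaker_id], quota):
            balanced.append((speaker_id, vector))

    # Shuffle so that vectors from one speaker are not contiguous.
    rng.shuffle(balanced)
    return balanced


# Toy data: three hypothetical speakers with 5, 2, and 9 vectors each.
toy_samples = (
    [("spk_a", [0.1 * i]) for i in range(5)]
    + [("spk_b", [1.0 * i]) for i in range(2)]
    + [("spk_c", [2.0 * i]) for i in range(9)]
)
balanced_samples = balance_by_speaker(toy_samples)
```

In this toy case each speaker ends up with two vectors, so no single speaker dominates the training set. Subsampling discards data; oversampling or per-speaker loss weighting are alternatives that the abstract neither confirms nor rules out.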
- Subjects :
- *SOCIAL dominance
Details
- Language :
- English
- ISSN :
- 01678655
- Volume :
- 173
- Database :
- Academic Search Index
- Journal :
- Pattern Recognition Letters
- Publication Type :
- Academic Journal
- Accession number :
- 171311686
- Full Text :
- https://doi.org/10.1016/j.patrec.2023.07.016