1. X-Vectors: new quantitative biomarkers for early Parkinson's disease detection from speech
- Author
- Laetitia Jeancolas, Dijana Petrovska-Delacrétaz, Graziella Mangone, Badr-Eddine Benkelfat, Jean-Christophe Corvol, Marie Vidailhet, Stéphane Lehéricy, and Habib Benali
- Affiliations: Institut du Cerveau et de la Moëlle Epinière = Brain and Spine Institute (ICM); Institut du Cerveau = Paris Brain Institute (ICM); Institut National de la Santé et de la Recherche Médicale (INSERM); CHU Pitié-Salpêtrière [AP-HP]; Assistance publique - Hôpitaux de Paris (AP-HP); Sorbonne Université (SU); Centre National de la Recherche Scientifique (CNRS); Institut Polytechnique de Paris (IP Paris); Département Electronique et Physique (TSP - EPH); Institut Mines-Télécom [Paris] (IMT)-Télécom SudParis (TSP); ARMEDIA (ARMEDIA-SAMOVAR); Services répartis, Architectures, MOdélisation, Validation, Administration des Réseaux (SAMOVAR); Service de Neuroradiologie [CHU Pitié-Salpêtrière]; Traitement de l'Information Pour Images et Communications (TIPIC-SAMOVAR); Concordia University [Montreal]
- Subjects
FOS: Computer and information sciences; Sound (cs.SD); Computer science; Speech recognition; Parkinson's disease; 02 engineering and technology; [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; Quantitative Biology - Quantitative Methods; Computer Science - Sound; Voice analysis; 0302 clinical medicine; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Audio and Speech Processing (eess.AS); Deep neural networks; Quantitative Methods (q-bio.QM); Original Research; Artificial neural network; Early detection; Speaker recognition; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; Computer Science Applications; MFCC; [SDV.NEU] Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]; Mel-frequency cepstrum; Electrical Engineering and Systems Science - Audio and Speech Processing; X-vectors; Microphone; 0206 medical engineering; Biomedical Engineering; Neuroscience (miscellaneous); Context (language use); lcsh:RC321-571; 03 medical and health sciences; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; Classifier (linguistics); FOS: Electrical engineering, electronic engineering, information engineering; lcsh:Neurosciences. Biological psychiatry. Neuropsychiatry; Telediagnosis; Telephone network; 020601 biomedical engineering; ComputingMethodologies_PATTERNRECOGNITION; FOS: Biological sciences; Automatic detection; 030217 neurology & neurosurgery; Neuroscience
- Abstract
Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease or on the gender effect. In this article, we adapted the latest speaker recognition system, called x-vectors, to detect early-stage PD from voice analysis. X-vectors are embeddings extracted from a deep neural network; they provide robust speaker representations and improve speaker recognition when large amounts of training data are used. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard MFCC-GMM classifier (Mel-Frequency Cepstral Coefficients - Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (including recently diagnosed PD subjects and healthy controls) with a high-quality microphone and with their own telephone. Men and women were analyzed separately in order to obtain more precise models and to assess a possible gender effect. Several experimental and methodological aspects were tested to analyze their impact on classification performance: audio segment duration, data augmentation, the type of dataset used to train the neural network, the type of speech task, and the back-end analyses. The x-vector technique provided better classification performance than MFCC-GMM for text-independent tasks, and seemed to be particularly suited to the early detection of PD in women (7 to 15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
Submitted
- Published
- 2021