Voice pathology detection on spontaneous speech data using deep learning models.

Authors :
Farazi, Sahar
Shekofteh, Yasser
Source :
International Journal of Speech Technology; Sep2024, Vol. 27 Issue 3, p739-751, 13p
Publication Year :
2024

Abstract

Speech problems are a common issue that affects people everywhere and can affect the quality of their lives. The human speech production system involves various components. Dysfunction of any of these components can disrupt normal speech production, giving rise to speech diseases like laryngopharyngeal reflux, vocal cord paralysis, and vocal fold nodules. Early diagnosis of these disorders is very important for the patient's health. Many studies in automatic diagnosis of voice pathology have used sustained vowel sounds and read speech as the primary source of speech data. However, it is crucial to recognize the unique value of spontaneous speech. In addition to inheriting the characteristics of read speech, spontaneous speech offers a more authentic glimpse into individuals' speech behavior. It captures not only linguistic features, but also subtle nuances of human emotions, such as fatigue and excitement, which may cause speech impairments, and shows their patterns in the speech signal better than read-speech data does. Therefore, we aim to explore spontaneous speech in voice pathology detection to determine if it can help us better understand speech problems. In this research, we examine different deep learning (DL) models trained on two main features (MFCC and Mel spectrograms) for binary classification of healthy speech versus pathological speech, with a specific focus on spontaneous speech. Extensive experimentation reveals the superiority of our proposed convolutional neural network (CNN) model trained on MFCC features. Notably, the CNN model achieves the highest accuracy, approximately 85% for test data and 92% for evaluation data. These results emphasize the potential of DL approaches in the accurate diagnosis of speech disorders through the analysis of spontaneous speech, offering promise for early detection and improved patient care. [ABSTRACT FROM AUTHOR]
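
The two feature types named in the abstract, Mel spectrograms and MFCCs, can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the abstract gives no extraction settings, so the sampling rate, frame length, hop size, number of Mel bands, and number of coefficients below are all assumptions, and the function names are hypothetical. The classifier itself is not shown.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to the Mel scale (common 2595*log10 form)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_and_mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=40, n_mfcc=13):
    """Return (log-Mel spectrogram, MFCCs), each with one row per frame.

    All default values are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Pool the power spectrum into Mel bands and take the log.
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # Unnormalised DCT-II over the Mel axis gives the cepstral coefficients.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)
    basis = np.cos(np.pi / n_mels * (n[None, :] + 0.5) * k[:, None])
    mfcc = log_mel @ basis.T
    return log_mel, mfcc


# Synthetic one-second tone standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
demo_signal = np.sin(2.0 * np.pi * 220.0 * t)
log_mel, mfcc = log_mel_and_mfcc(demo_signal, sr=sr)
print(log_mel.shape, mfcc.shape)
```

Either array could serve as the two-dimensional (time by feature) input to a classifier such as a CNN. The MFCC matrix is the more compact of the two, since the cosine transform keeps only the first few coefficients of each frame's log-Mel spectrum.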

Details

Language :
English
ISSN :
13812416
Volume :
27
Issue :
3
Database :
Complementary Index
Journal :
International Journal of Speech Technology
Publication Type :
Academic Journal
Accession number :
179604701
Full Text :
https://doi.org/10.1007/s10772-024-10134-4