Voice pathology detection on spontaneous speech data using deep learning models.
- Source :
- International Journal of Speech Technology; Sep 2024, Vol. 27, Issue 3, pp. 739-751 (13 pages)
- Publication Year :
- 2024
Abstract
- Speech problems are common worldwide and can reduce quality of life. The human speech production system involves several components, and dysfunction of any of them can disrupt normal speech production, giving rise to speech disorders such as laryngopharyngeal reflux, vocal cord paralysis, and vocal fold nodules. Early diagnosis of these disorders is important for the patient's health. Many studies on automatic diagnosis of voice pathology have used sustained vowel sounds and read speech as the primary source of speech data. However, spontaneous speech has unique value. In addition to sharing the characteristics of read speech, it offers a more authentic view of individuals' speech behavior. It captures not only linguistic features but also subtle nuances of human emotion, such as fatigue and excitement, which may cause speech impairments, and it shows their patterns in the speech signal better than read speech does. We therefore explore spontaneous speech in voice pathology detection to determine whether it can help us better understand speech problems. In this research, we examine different deep learning (DL) models trained on two main features (MFCCs and Mel spectrograms) for binary classification of healthy versus pathological speech, with a specific focus on spontaneous speech. Extensive experimentation shows that our proposed convolutional neural network (CNN) model trained on MFCC features performs best, achieving the highest accuracy: approximately 85% on test data and 92% on evaluation data. These results highlight the potential of DL approaches for accurate diagnosis of speech disorders through analysis of spontaneous speech, offering promise for early detection and improved patient care. [ABSTRACT FROM AUTHOR]
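The abstract names MFCCs as the better-performing input feature but gives no extraction details. As a rough illustration only, the sketch below computes MFCCs for a single audio frame using the standard library: windowing, power spectrum, triangular mel filterbank, log, and DCT-II. The function names, the HTK-style mel formula, the Hamming window, and the parameter values (20 filters, 13 coefficients, 8 kHz sampling) are all assumptions chosen for the example, not the authors' configuration.

```python
import math

def hz_to_mel(hz):
    # HTK-style mel formula; one common convention (assumed, not from the paper).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive DFT of a Hamming-windowed frame; returns len(frame) // 2 + 1 values.
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2 * math.pi * k * i / n) for i, w in enumerate(win))
        im = -sum(w * math.sin(2 * math.pi * k * i / n) for i, w in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    max_mel = hz_to_mel(sample_rate / 2)
    mels = [max_mel * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):       # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):       # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    # Unnormalised DCT-II, keeping the first n_coeffs coefficients.
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(n_coeffs)]

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies so the logarithm is defined for empty bands.
    log_energies = [math.log(max(sum(f * s for f, s in zip(row, spec)), 1e-10))
                    for row in bank]
    return dct2(log_energies, n_coeffs)

# Usage: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
coeffs = mfcc(frame, 8000)
```

Stacking such coefficient vectors over successive frames yields the two-dimensional array that a CNN classifier would typically consume; in practice an audio library would replace the naive DFT used here.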
Details
- Language :
- English
- ISSN :
- 1381-2416
- Volume :
- 27
- Issue :
- 3
- Database :
- Complementary Index
- Journal :
- International Journal of Speech Technology
- Publication Type :
- Academic Journal
- Accession number :
- 179604701
- Full Text :
- https://doi.org/10.1007/s10772-024-10134-4