An end-to-end continuous Kannada ASR system under uncontrolled environment.
- Source :
- Multimedia Tools & Applications; Jan2024, Vol. 83 Issue 3, p7981-7994, 14p
- Publication Year :
- 2024
Abstract
- Achieving high speech recognition accuracy under real-time conditions remains a challenging task, and many researchers are working to improve it. In this paper, we developed a system for recognizing continuous Kannada speech sentences under real-time conditions. To develop the automatic speech recognition (ASR) models, we used task-specific continuous Kannada speech data gathered from speakers/farmers in real-time conditions. We designed an interactive voice response system (IVRS) and collected 40 continuous Kannada speech sentences. We transcribed and validated the recordings and extracted speech features using the Mel-frequency cepstral coefficient (MFCC) technique. We used 90% of the validated continuous Kannada speech data for Kaldi system training and the remaining 10% for decoding, at different phoneme levels. The experimental results revealed that the time delay neural network (TDNN) based ASR models outperformed both the ASR models built with other acoustic modelling techniques and the earlier deep neural network (DNN)-hidden Markov model (HMM) based continuous Kannada ASR (CKASR) system. The ASR models with the lowest word error rate (WER) were used to develop the real-time end-to-end (E2E) CKASR system. We verified the developed E2ECKASR system by testing it with 550 speakers/farmers under real-time conditions. [ABSTRACT FROM AUTHOR]
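The abstract selects models by word error rate (WER). As a reading aid, here is a minimal Python sketch of the standard WER definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is not the authors' code or Kaldi's scoring script. The function name `word_error_rate`, the model labels `model_a` and `model_b`, and the English placeholder sentences are all hypothetical and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical placeholder data, for illustration only.
reference = "what is the price of rice today"
outputs = {
    "model_a": "what is price of rice to day",
    "model_b": "what is the price of rice today",
}

# Score each hypothetical model and keep the one with the lowest WER.
scores = {name: word_error_rate(reference, text) for name, text in outputs.items()}
best_model = min(scores, key=scores.get)
```

In practice WER is accumulated over a whole test set (total edits divided by total reference words) rather than averaged per sentence, so a real evaluation would sum the edit counts before dividing.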
Details
- Language :
- English
- ISSN :
- 13807501
- Volume :
- 83
- Issue :
- 3
- Database :
- Complementary Index
- Journal :
- Multimedia Tools & Applications
- Publication Type :
- Academic Journal
- Accession number :
- 174659674
- Full Text :
- https://doi.org/10.1007/s11042-023-15854-4