Bangla Short Speech Commands Recognition Using Convolutional Neural Networks

Authors :
Shakil Ahmed Sumon
Joydip Chowdhury
Sujit Debnath
Nabeel Mohammed
Sifat Momen
Source :
2018 International Conference on Bangla Speech and Language Processing (ICBSLP).
Publication Year :
2018
Publisher :
IEEE, 2018.

Abstract

Although Bangla is one of the most widely spoken languages in the world, no significant efforts have been made in Bangla speech recognition. Speech recognition is a difficult task, particularly when it must be performed in noisy real-life conditions. This study reports a Bangla short speech commands data set in which all samples were recorded in real-life settings. Three different convolutional neural network (CNN) architectures were designed to recognize these short speech commands. One approach extracts Mel-frequency cepstral coefficient (MFCC) features from the audio files, whereas a second CNN architecture uses only the raw audio files. Lastly, a model pre-trained on a large English short speech commands data set was fine-tuned by retraining on the Bangla data set. Experimental results reveal that the MFCC model shows better accuracy in recognizing Bangla short speech commands, although, surprisingly, the model predicting on raw audio data is very competitive. The models are proficient at identifying single-syllable words but have difficulty recognizing multi-syllable commands.

Details

Database :
OpenAIRE
Journal :
2018 International Conference on Bangla Speech and Language Processing (ICBSLP)
Accession number :
edsair.doi...........a6edb3c9dc7764b1a106114555b9bc95
Full Text :
https://doi.org/10.1109/icbslp.2018.8554395