
Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques

Authors :
Mahendra Kumar Gourisaria
Rakshit Agrawal
Manoj Sahni
Pradeep Kumar Singh
Source :
Discover Internet of Things, Vol 4, Iss 1, Pp 1-23 (2024)
Publication Year :
2024
Publisher :
Springer, 2024.

Abstract

In the era of automated and digitalized information, advanced computer applications handle large volumes of data, much of which is audio-related. Advances in technology allow modern devices to deliver comprehensive insights into audio content by extracting features such as Mel Frequency Cepstral Coefficients (MFCCs) and the Short-Time Fourier Transform (STFT). Our study supports efficient audio file management and retrieval, and it also plays a vital role in security, the robotics industry, and investigations. Beyond its industrial applications, our model is versatile in the corporate sector, for example in tasks such as siren sound detection. This capability can support the development of advanced automated systems, improving efficiency and safety across various corporate domains. The primary aim of our experiment is to create highly efficient audio classification models that can be automated and deployed within the industrial sector, addressing the need for enhanced productivity and performance. Despite the dynamic nature of environmental sounds and the presence of noise, the presented audio classification model proves efficient and accurate. The novelty of our research lies in comparing two different audio datasets with similar characteristics: we classify the audio signals into several categories using various machine learning techniques applied to MFCC and STFT features extracted from the signals. We also tested the results before and after noise removal to analyze the effect of noise on precision, recall, specificity, and F1-score. Our experiment shows that the ANN model outperforms the other six audio models, with accuracies of 91.41% and 91.27% on the respective datasets.
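
The abstract reports precision, recall, specificity, and F1-score without defining them. As a minimal sketch, these metrics can be computed from the confusion counts of a single class treated one-vs-rest. The function name and the example counts below are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, specificity, and F1 from confusion counts.

    tp, fp, fn, tn are the true-positive, false-positive, false-negative,
    and true-negative counts for one class (one-vs-rest).
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the study).
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class task such as the one described, one common choice is to compute these per class and then average them; the abstract does not state which averaging the authors used.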

Details

Language :
English
ISSN :
27307239
Volume :
4
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Discover Internet of Things
Publication Type :
Academic Journal
Accession number :
edsdoj.5230b33e86949eeb59d6b5e666f5a82
Document Type :
article
Full Text :
https://doi.org/10.1007/s43926-023-00049-y