High Accurate Environmental Sound Classification: Sub-Spectrogram Segmentation versus Temporal-Frequency Attention Mechanism
- Source :
- Sensors (Basel, Switzerland), Volume 21, Issue 16, Article 5500 (2021)
- Publication Year :
- 2021
- Publisher :
- MDPI AG, 2021.
Abstract
- In the important and challenging field of environmental sound classification (ESC), feature representation ability is a crucial, even decisive, factor because it directly affects classification accuracy. Classification performance therefore depends to a large extent on whether effective representative features can be extracted from the environmental sound. In this paper, we first propose an ESC framework based on sub-spectrogram segmentation with score-level fusion, and we adopt the proposed convolutional recurrent neural network (CRNN) to improve classification accuracy. By evaluating numerous truncation schemes, we numerically determine the optimal number of sub-spectrograms and the corresponding band ranges. On this basis, we propose a joint attention mechanism that combines temporal and frequency attention mechanisms and uses the global attention mechanism when generating the attention map. Finally, the numerical results show that the two proposed frameworks achieve 82.1% and 86.4% classification accuracy, respectively, on the public environmental sound dataset ESC-50, an improvement of more than 13.5% over the traditional baseline scheme.
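The sub-spectrogram segmentation and score-level fusion described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's band ranges, CRNN architecture, or fusion rule, so everything here is an assumption for illustration: the names `split_bands`, `fuse_scores`, and `predict` are hypothetical, the band edges are arbitrary, and fusion is shown as a simple weighted average of per-band class scores.

```python
from typing import List, Sequence

# Rows are frequency bins, columns are time frames.
Spectrogram = List[List[float]]


def split_bands(spec: Spectrogram, edges: Sequence[int]) -> List[Spectrogram]:
    """Cut a spectrogram along the frequency axis at the given bin edges.

    The edges are an assumption; the paper selects its own band ranges.
    """
    edges = list(edges)
    if edges != sorted(set(edges)) or edges[0] != 0 or edges[-1] != len(spec):
        raise ValueError("edges must increase strictly from 0 to the number of bins")
    return [spec[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]


def fuse_scores(band_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Score-level fusion, assumed here to be a weighted average per class."""
    total = sum(weights)
    n_classes = len(band_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, band_scores)) / total
        for c in range(n_classes)
    ]


def predict(band_scores: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(band_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy input: 8 frequency bins by 3 time frames, split into 3 bands.
spec = [[float(f + t) for t in range(3)] for f in range(8)]
bands = split_bands(spec, [0, 2, 5, 8])

# Stand-in for per-band classifier outputs (one score vector per band).
band_scores = [[0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]
label = predict(band_scores, weights=[1.0, 1.0, 2.0])
```

In a real pipeline each band would be passed to its own classifier to produce the score vectors; the fixed `band_scores` above only stand in for those outputs.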
- Subjects :
- Computer science
TP1-1185
Biochemistry
Article
Field (computer science)
Analytical Chemistry
Feature (machine learning)
Segmentation
Truncation (statistics)
Electrical and Electronic Engineering
Representation (mathematics)
Instrumentation
environmental sound classification
Basis (linear algebra)
Chemical technology
Pattern recognition
Atomic and Molecular Physics, and Optics
Sound
Recurrent neural network
Spectrogram
Neural Networks, Computer
Artificial intelligence
sub-spectrogram segmentation
score level fusion
convolutional recurrent neural network
temporal-frequency attention mechanism
Details
- Language :
- English
- ISSN :
- 14248220
- Volume :
- 21
- Issue :
- 16
- Database :
- OpenAIRE
- Journal :
- Sensors
- Accession number :
- edsair.doi.dedup.....858de5d7e2ff3d08d2e380c26375e0d5