Investigation of acoustic and visual features for acoustic scene classification.

Authors :
Xie, Jie
Zhu, Mingying
Source :
Expert Systems with Applications. Jul2019, Vol. 126, p20-29. 10p.
Publication Year :
2019

Abstract

Highlights
• Aggregate acoustic and visual features for acoustic scene classification.
• Investigate three feature selection methods for acoustic scene classification.
• Compare time-frequency representations for visual feature extraction.

Abstract
Acoustic scene classification has gained great interest in recent years due to its diverse applications. Various acoustic and visual features have been proposed and evaluated. However, few studies have investigated acoustic and visual feature aggregation for acoustic scene classification. In this paper, we investigate various feature sets based on the fusion of acoustic and visual features. Specifically, the acoustic features are extracted directly from the waveform: spectral centroid, spectral entropy, spectral flux, spectral roll-off, short-time energy, zero-crossing rate, and Mel-frequency cepstral coefficients. For the visual features, we calculate local binary patterns, histograms of gradients, and moments from the time-frequency representation of the audio scene. Then, three feature selection algorithms are applied to the various feature sets to reduce feature dimensionality: correlation-based feature selection, principal component analysis, and ReliefF. Experimental results show that the proposed system achieves an accuracy improvement of 15.43% over the baseline system on the development set. When all development sets are used for training, the performance on the evaluation set provided by the TUT Acoustic scene 2016 challenge is 87.44%, which is the fourth best among all non-neural network systems. [ABSTRACT FROM AUTHOR]
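
The fusion of acoustic and visual features described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (frame_signal, acoustic_features, lbp_histogram, fuse_features), the frame length and hop size, the Hann window, the log-magnitude spectrogram, and the basic 8-neighbour local binary pattern are all assumptions made for illustration. Only three of the acoustic features and one of the visual features named in the abstract are covered, and the feature selection step is omitted.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def acoustic_features(x, sr, frame_len=1024, hop=512):
    """Return clip-level means of three frame-level features, plus the spectrogram."""
    frames = frame_signal(x, frame_len, hop)
    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    # Short-time energy: mean squared amplitude of each frame.
    energy = np.mean(frames ** 2, axis=1)
    # Spectral centroid: magnitude-weighted mean frequency of each frame.
    mag = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)
    return np.array([zcr.mean(), energy.mean(), centroid.mean()]), mag

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes (assumed variant)."""
    h, w = image.shape
    center = image[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as large as the centre.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def fuse_features(x, sr):
    """Concatenate the acoustic vector with the LBP histogram of the spectrogram."""
    acoustic, mag = acoustic_features(x, sr)
    visual = lbp_histogram(np.log1p(mag.T))  # rows: frequency, columns: time
    return np.concatenate([acoustic, visual])

# Usage on a synthetic 440 Hz tone (a stand-in for a real recording).
sr = 16000
t = np.arange(sr) / sr
features = fuse_features(np.sin(2 * np.pi * 440 * t), sr)
```

The fused vector here has 3 acoustic entries followed by 256 histogram bins. In a pipeline like the one the abstract describes, a dimensionality reduction or feature selection step would be applied to such vectors before classification.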

Details

Language :
English
ISSN :
09574174
Volume :
126
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
135532041
Full Text :
https://doi.org/10.1016/j.eswa.2019.01.085