
Ensemble of convolutional neural networks to improve animal audio classification

Authors :
Loris Nanni
Yandre M. G. Costa
Rafael L. Aguiar
Rafael B. Mangolin
Sheryl Brahnam
Carlos N. Silla
Source :
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2020, Iss 1, Pp 1-14 (2020)
Publication Year :
2020
Publisher :
SpringerOpen, 2020.

Abstract

In this work, we present an ensemble for automated audio classification that fuses different types of features extracted from audio files. These features are evaluated, compared, and fused with the goal of producing better classification accuracy than other state-of-the-art approaches without ad hoc parameter optimization. We present an ensemble of classifiers that performs competitively on different types of animal audio datasets using the same set of classifiers and parameter settings. To produce this general-purpose ensemble, we ran a large number of experiments that fine-tuned pretrained convolutional neural networks (CNNs) for different audio classification tasks (bird, bat, and whale audio datasets). Six different CNNs were tested, compared, and combined. Moreover, a further CNN, trained from scratch, was tested and combined with the fine-tuned CNNs. To the best of our knowledge, this is the largest study on CNNs in animal audio classification. Our results show that several CNNs can be fine-tuned and fused for robust and generalizable audio classification. Finally, the ensemble of CNNs is combined with handcrafted texture descriptors obtained from spectrograms for further improvement of performance. The MATLAB code used in our experiments will be provided to other researchers for future comparisons at https://github.com/LorisNanni.
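As a rough illustration of the pipeline the abstract describes, the MATLAB sketch below (not the authors' released code, which is available at the GitHub link above) shows one plausible way to convert an audio clip into a spectrogram image, fine-tune a pretrained CNN on such images, and fuse the CNN scores with those of a handcrafted-texture classifier. The choice of ResNet-50, the file and folder names, the STFT settings, and the training options are all illustrative assumptions, not details taken from the paper.

% Minimal MATLAB sketch, assuming the Signal Processing, Image Processing,
% and Deep Learning Toolboxes. Folder names, file names, STFT settings, and
% training options are illustrative, not taken from the paper.

% 1) Turn an audio clip into a log-magnitude spectrogram image.
[x, fs] = audioread('example_clip.wav');            % hypothetical input file
x = mean(x, 2);                                     % collapse to mono
s = abs(spectrogram(x, hann(512), 256, 512, fs));   % magnitude STFT
img = imresize(mat2gray(log(1 + s)), [224 224]);    % log scale, CNN input size
imwrite(repmat(img, [1 1 3]), 'spectrograms/classA/example_clip.png');
% (In practice this step is repeated for every clip, one subfolder per class.)

% 2) Fine-tune a pretrained CNN (ResNet-50 here) on the spectrogram images.
imds = imageDatastore('spectrograms', 'IncludeSubfolders', true, ...
                      'LabelSource', 'foldernames');
[trainSet, testSet] = splitEachLabel(imds, 0.8, 'randomized');
numClasses = numel(categories(trainSet.Labels));
lgraph = layerGraph(resnet50);
lgraph = replaceLayer(lgraph, 'fc1000', ...
                      fullyConnectedLayer(numClasses, 'Name', 'fc_new'));
lgraph = replaceLayer(lgraph, 'ClassificationLayer_fc1000', ...
                      classificationLayer('Name', 'out_new'));
opts = trainingOptions('sgdm', 'InitialLearnRate', 1e-4, ...
                       'MaxEpochs', 10, 'MiniBatchSize', 32);
tunedNet = trainNetwork(trainSet, lgraph, opts);

% 3) Score-level fusion with a handcrafted-texture classifier (sum rule).
[~, cnnScores] = classify(tunedNet, testSet);
% textureScores would come from, e.g., an SVM trained on extractLBPFeatures
% of the same spectrogram images (omitted here for brevity).
% fusedScores = (cnnScores + textureScores) / 2;
% [~, predIdx] = max(fusedScores, [], 2);

A simple score-level (sum-rule) fusion like the one sketched in step 3 is a common way to combine heterogeneous classifiers, since each one only needs to output class scores on the same test set; the paper's ensemble of fine-tuned CNNs and texture descriptors is combined in a similar spirit.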

Details

Language :
English
ISSN :
1687-4722
Volume :
2020
Issue :
1
Database :
Directory of Open Access Journals
Journal :
EURASIP Journal on Audio, Speech, and Music Processing
Publication Type :
Academic Journal
Accession number :
edsdoj.24db37327d840a28501f69b6e1a78fb
Document Type :
Article
Full Text :
https://doi.org/10.1186/s13636-020-00175-3