1. Detecting and reducing heterogeneity of error in acoustic classification
- Author
Oliver C. Metcalf, Jos Barlow, Yves Bas, Erika Berenguer, Christian Devenish, Filipe França, Stuart Marsden, Charlotte Smith, Alexander C. Lees, Manchester Metropolitan University (MMU), Lancaster University, Centre d'Ecologie et des Sciences de la COnservation (CESCO), Muséum national d'Histoire naturelle (MNHN)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Centre d’Ecologie Fonctionnelle et Evolutive (CEFE), Université Paul-Valéry - Montpellier 3 (UPVM)-École Pratique des Hautes Études (EPHE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche pour le Développement (IRD [France-Sud])-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)-Institut Agro Montpellier, Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Université de Montpellier (UM), University of Oxford, School of Biological Sciences [Bristol], and University of Bristol [Bristol]
- Subjects
autonomous recording unit, bioacoustics, ecoacoustics, [SDE.IE]Environmental Sciences/Environmental Engineering, Ecological Modeling, machine-learning, [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD], [PHYS.MECA.BIOM]Physics [physics]/Mechanics [physics]/Biomechanics [physics.med-ph], automated signal recognition, Ecology, Evolution, Behavior and Systematics, [PHYS.MECA.ACOU]Physics [physics]/Mechanics [physics]/Acoustics [physics.class-ph]
- Abstract
Passive acoustic monitoring can be an effective method for monitoring species, allowing the assembly of large audio datasets, removing logistical constraints in data collection and reducing anthropogenic monitoring disturbances. However, the analysis of large acoustic datasets is challenging, and fully automated machine learning processes are rarely developed or implemented in ecological field studies. One of the greatest uncertainties hindering the development of these methods is spatial generalisability: can an algorithm trained on data from one place be used elsewhere? First, we demonstrate that heterogeneity of error across space is a problem that could go undetected using common classification accuracy metrics. Second, we develop a method to assess the extent of heterogeneity of error in a random forest classification model for six Amazonian bird species. Finally, we propose two complementary ways to reduce heterogeneity of error, by (i) accounting for it in the thresholding process and (ii) using a secondary classifier that uses contextual data. We found that using a thresholding approach that accounted for heterogeneity of precision error reduced the coefficient of variation of the precision score from a mean of 0.61 ± 0.17 (SD) to 0.41 ± 0.25 in comparison to the initial classification with threshold selection based on F-score. The use of a secondary, contextual classification with threshold selection accounting for heterogeneity of precision reduced it further still, to 0.16 ± 0.13, and was significantly lower than the initial classification in all but one species. Mean average precision scores increased from 0.66 ± 0.4 for the initial classification to 0.95 ± 0.19, a significant improvement for all species. We recommend assessing, and if necessary correcting for, heterogeneity of precision error when using automated classification on acoustic data to quantify species presence as a function of an environmental, spatial or temporal predictor variable.
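The coefficient of variation of precision mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the record layout `(site, score, is_target)`, the use of the population standard deviation, and the toy detections are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def precision_by_site(records, threshold):
    """Precision per site for detections scoring at or above `threshold`.

    `records` is an iterable of (site, score, is_target) tuples; this layout
    is hypothetical and not taken from the paper.
    """
    counts = {}
    for site, score, is_target in records:
        if score >= threshold:
            true_pos, total = counts.get(site, (0, 0))
            counts[site] = (true_pos + int(is_target), total + 1)
    return {site: tp / total for site, (tp, total) in counts.items()}


def coefficient_of_variation(values):
    """Standard deviation divided by the mean.

    Assumption: population SD is used; the abstract does not say which
    estimator the authors chose.
    """
    values = list(values)
    avg = mean(values)
    if avg == 0:
        return float("inf")
    return pstdev(values) / avg


# Toy, made-up detections at two sites (illustrative only, not study data).
records = [
    ("A", 0.9, True), ("A", 0.8, True), ("A", 0.6, False), ("A", 0.4, True),
    ("B", 0.9, True), ("B", 0.7, False), ("B", 0.6, False), ("B", 0.3, False),
]

for threshold in (0.5, 0.8):
    per_site = precision_by_site(records, threshold)
    cv = coefficient_of_variation(per_site.values())
    print(threshold, per_site, round(cv, 3))
```

In this toy case the lower threshold gives unequal precision between the two sites, while the higher threshold gives equal precision at both and therefore a coefficient of variation of zero. Comparing candidate thresholds on this statistic, rather than on a pooled F-score alone, is one plausible way to take heterogeneity of precision into account.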
- Published
- 2022