
Semiautomated generation of species-specific training data from large, unlabeled acoustic datasets for deep supervised birdsong isolation.

Authors :
Sasek, Justin
Allison, Brendan
Contina, Andrea
Knobles, David
Wilson, Preston
Keitt, Timothy
Source :
PeerJ; Sep2024, p1-22, 22p
Publication Year :
2024

Abstract

Background: Bioacoustic monitoring is an effective and minimally invasive method to study wildlife ecology. However, even state-of-the-art techniques for analyzing birdsongs lose accuracy in the presence of extraneous signals such as anthropogenic noise and vocalizations of non-target species. Deep supervised source separation (DSSS) algorithms have been shown to effectively separate mixtures of animal vocalizations. In practice, however, recording sites also have site-specific variations and unique background audio that need to be removed, which calls for site-specific training data.
Methods: Here, we test the potential of training DSSS models on site-specific bird vocalizations and background audio. We used a semiautomated workflow, based on deep supervised classification and statistical cleaning, to label and generate a site-specific source separation dataset by mixing birdsongs and background audio segments. Then, we trained a DSSS model with this generated dataset. Because most data are passively recorded and consequently noisy, the true isolated birdsongs are unavailable, which makes evaluation challenging. Therefore, in addition to using traditional source separation (SS) metrics, we also show the effectiveness of our site-specific approach using metrics commonly used in ornithological analyses, such as automated feature labeling and species-specific trilateration accuracy.
Results: Our approach of training on site-specific data boosts the source-to-distortion, source-to-interference, and source-to-artifact ratios (SDR, SIR, and SAR) by 9.33 dB, 24.07 dB, and 3.60 dB, respectively. We also find that our approach allows for automated feature labeling with single-digit mean absolute percent error and birdsong trilateration with a mean simulated trilateration error of 2.58 m.
Conclusion: Overall, we show that site-specific DSSS is a promising upstream solution for wildlife audio analysis tools that break down in the presence of background noise. By training on site-specific data, our method is robust to unique, site-specific interference that caused previous methods to fail. [ABSTRACT FROM AUTHOR]
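
Illustrative note (not part of the authors' abstract): the dataset-generation step described under Methods, mixing isolated birdsongs with background audio, can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' implementation. The function names `mix_at_snr` and `simple_sdr_db`, the choice of a target signal-to-noise ratio as the mixing parameter, and the synthetic signals are all hypothetical. The ratio computed here is a simplified signal-to-distortion figure; it does not perform the decomposition into SDR, SIR, and SAR that standard source separation evaluation uses.

```python
import numpy as np


def mix_at_snr(song, background, snr_db):
    """Add background to song after scaling it to a target song-to-background ratio.

    Hypothetical helper: returns (mixture, scaled_background).
    """
    song = np.asarray(song, dtype=float)
    background = np.asarray(background, dtype=float)
    if song.shape != background.shape:
        raise ValueError("song and background must have the same shape")
    p_song = np.mean(song ** 2)
    p_bg = np.mean(background ** 2)
    if p_song == 0.0 or p_bg == 0.0:
        raise ValueError("both signals must have non-zero power")
    # Choose the gain so that p_song / (gain**2 * p_bg) == 10 ** (snr_db / 10).
    gain = np.sqrt(p_song / (p_bg * 10.0 ** (snr_db / 10.0)))
    scaled_background = gain * background
    return song + scaled_background, scaled_background


def simple_sdr_db(reference, estimate):
    """Simplified distortion ratio in dB: reference energy over residual energy."""
    reference = np.asarray(reference, dtype=float)
    residual = np.asarray(estimate, dtype=float) - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(residual ** 2))


# Synthetic stand-ins for a birdsong segment and a background segment.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
song = np.sin(2.0 * np.pi * 3000.0 * t)
background = rng.normal(size=t.shape)

# The (mixture, song) pair is one supervised training example for a separator.
mixture, scaled_background = mix_at_snr(song, background, snr_db=5.0)
```

Because the clean song is known for every generated mixture, a separation model's output can be scored against it directly, which is not possible for the passively recorded field audio itself.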

Details

Language :
English
ISSN :
21678359
Database :
Complementary Index
Journal :
PeerJ
Publication Type :
Academic Journal
Accession number :
180705525
Full Text :
https://doi.org/10.7717/peerj.17854