
Towards the topology of autoencoder of calls versus clicks of marine mammal

Authors :
Maxence Ferrari
Faicel Chamroukhi
Hervé Glotin
Vincent Roger
Ricard Marxer
Source :
The Journal of the Acoustical Society of America. 144:1777-1778
Publication Year :
2018
Publisher :
Acoustical Society of America (ASA), 2018.

Abstract

The goal is to learn features and representations adapted to cetacean sound dynamics without any priors. To this end, we develop a data-driven model to generate the voicing and clicks of cetacean audio signals. We learn representations and features of stationary or nonstationary emissions from raw audio using neural networks. We use different types of convolutions (causal, strided, dilated [1]) or gradient inversion [2]. Experiments are conducted on various kinds of humpback whale calls from the NIPS4B challenge [3] or on orca calls. We compare the topology for transient encoding on Physeters and Inia g. For each model, we detail the resulting filters and discuss the topology.

We acknowledge Region PACA and NortekMED for Roger’s PhD grant, and DGA and Region Haut de France for Ferrari’s PhD grant.

[1] Oord, Dieleman, Zen, Simonyan, Vinyals, Graves et al., WaveNet: A generative model for raw audio, arXiv:1609.03499, 2016.
[2] Balestriero, Roger, Glotin, Baraniuk, Semi-Supervised Learning via New Deep Network Inversion, arXiv:1711.04313, 2017.
[3] Glotin, LeCun, Mallat et al., Proc. 1st Workshop on Neural Information Processing for Bioacoustics (NIPS4B), joint to NIPS, Alberta USA, 2013. http://sabiod.org/nips4b/challenge2.html, http://sabiod.org/NIPS4B2013_book.pdf
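
The abstract names three convolution variants (causal, strided, dilated) without showing how they combine. The following is a minimal sketch in plain Python, not the authors' implementation: the function name `causal_conv1d` and its parameters are hypothetical, a single channel is assumed, and samples before the start of the signal are treated as zero.

```python
def causal_conv1d(x, w, dilation=1, stride=1):
    """Single-channel causal convolution with optional dilation and stride.

    Hypothetical illustration only. The output at time t uses
    x[t], x[t - dilation], x[t - 2*dilation], ... and never a future sample.
    """
    out = []
    # A stride > 1 keeps only every `stride`-th output sample.
    for t in range(0, len(x), stride):
        acc = 0.0
        for j, wj in enumerate(w):
            i = t - j * dilation  # tap j reaches j * dilation samples back
            if i >= 0:            # samples before the start count as zero
                acc += wj * x[i]
        out.append(acc)
    return out


signal = [1, 2, 3, 4, 5]
dilated = causal_conv1d(signal, [1, 1], dilation=2)
strided = causal_conv1d(signal, [1, 1], dilation=2, stride=2)
```

With a two-tap filter and a dilation of 2, each output sample sums the current sample and the one two steps earlier, so the receptive field grows without adding weights; the stride then subsamples the output.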

Details

ISSN :
00014966
Volume :
144
Database :
OpenAIRE
Journal :
The Journal of the Acoustical Society of America
Accession number :
edsair.doi...........b2ff09264db0811feaa467924791cc74
Full Text :
https://doi.org/10.1121/1.5067859