
Cross-Modal Variational Inference For Bijective Signal-Symbol Translation

Authors :
Chemla--Romeu-Santos, Axel
Ntalampiras, Stavros
Esling, Philippe
Haus, Goffredo
Assayag, Gérard
Représentations musicales (Repmus)
Sciences et Technologies de la Musique et du Son (STMS)
Institut de Recherche et Coordination Acoustique/Musique (IRCAM)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
Laboratorio d'Informatica Musicale (LIM)
Università degli Studi di Milano [Milano] (UNIMI)
Source :
Proceedings of the 22nd International Conference on Digital Audio Effects (DAFx-19), 2019
Publication Year :
2019
Publisher :
HAL CCSD, 2019.

Abstract

Extraction of symbolic information from signals is an active field of research enabling numerous applications, especially in the Music Information Retrieval domain. This complex task, related to topics such as pitch extraction and instrument recognition, has given rise to numerous approaches, mostly based on advanced signal-processing algorithms. However, these techniques are often non-generic: they extract definite physical properties of the signal (pitch, octave) but do not accommodate arbitrary vocabularies or more general annotations. Moreover, they are one-sided: they can extract symbolic data from an audio signal, but cannot perform the reverse process of symbol-to-signal generation. In this paper, we propose a bijective approach to signal/symbol translation by turning this problem into a density estimation task over the signal and symbolic domains, considered as related random variables. We estimate this joint distribution with two variational auto-encoders, one for each domain, whose inner representations are forced to match through an additive constraint; each model can thus learn and generate separately while enabling both signal-to-symbol and symbol-to-signal inference. We test our models on pitch, octave and dynamics symbols, which constitute a fundamental step towards music transcription and label-constrained audio generation. In addition to its versatility, the system is lightweight during both training and generation, and allows several interesting creative uses that we outline at the end of the article.
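The abstract describes training two variational auto-encoders, one per domain, with an additive constraint pulling their inner representations together. A minimal sketch of such a combined objective is given below; the function names, the toy affine "encoders" standing in for each VAE's posterior mean, and the squared-distance form of the latent-matching penalty are illustrative assumptions, not the paper's actual implementation.

```python
def encode(x, w, b):
    """Toy deterministic 'encoder': an affine map into a shared latent
    space, standing in for a VAE's approximate posterior mean (assumed
    form, for illustration only)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def l2(a, b):
    """Squared Euclidean distance between two latent codes."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def joint_loss(elbo_signal, elbo_symbol, z_signal, z_symbol, lam=1.0):
    """Combined objective: each domain's negative ELBO plus an additive
    penalty (weight lam) forcing the two latent codes to match."""
    return -elbo_signal - elbo_symbol + lam * l2(z_signal, z_symbol)

# Example: identical latent codes incur no penalty, so only the two
# negative ELBO terms remain; mismatched codes add lam * ||z1 - z2||^2.
z_sig = encode([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
z_sym = [1.0, 2.0]
loss = joint_loss(-1.0, -2.0, z_sig, z_sym)  # penalty term is zero here
```

Minimizing such a loss jointly would let each auto-encoder reconstruct its own domain while the shared latent space mediates translation in either direction, which is the property the abstract relies on for bijectivity.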

Details

Language :
English
Database :
OpenAIRE
Journal :
Proceedings of the 22nd International Conference on Digital Audio Effects (DAFx-19), 2019
Accession number :
edsair.doi.dedup.....6412a13c8abcf6e9efbaad10068c8087