1. Asteroid: the PyTorch-based audio source separation toolkit for researchers
- Author
-
Ariel Frank, Emmanuel Vincent, Fabian-Robert Stöter, Mathieu Hu, Joris Cosentino, Manuel Pariente, Samuele Cornell, Sunit Sivasankaran, David Ditter, Efthymios Tzinis, Juan M. Martín-Doñas, Antoine Deleforge, Michel Olvera, Jens Heitkaemper, Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS), Università Politecnica delle Marche [Ancona] (UNIVPM), Department of Electrical and Computer Engineering [Urbana] (University of Illinois), University of Illinois at Urbana-Champaign [Urbana], University of Illinois System-University of Illinois System, University of Paderborn, Scientific Data Management (ZENITH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Universidad de Granada = University of Granada (UGR), University of Hamburg, Technion - Israel Institute of Technology [Haifa], Grid'5000, Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Inria Sophia Antipolis - Méditerranée (CRISAM), and University of Granada [Granada]
- Subjects
FOS: Computer and information sciences ,Sound (cs.SD) ,Computer science ,open-source software ,020207 software engineering ,02 engineering and technology ,[INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE] ,end-to-end ,Computer Science - Sound ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,Speech enhancement ,Computer engineering ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,Audio and Speech Processing (eess.AS) ,Asteroid ,source separation ,[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD] ,FOS: Electrical engineering, electronic engineering, information engineering ,0202 electrical engineering, electronic engineering, information engineering ,Source separation ,020201 artificial intelligence & image processing ,speech enhancement ,Software architecture ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
This paper describes Asteroid, the PyTorch-based audio source separation toolkit for researchers. Inspired by the most successful neural source separation systems, it provides all neural building blocks required to build such a system. To improve reproducibility, Kaldi-style recipes on common audio source separation datasets are also provided. This paper describes the software architecture of Asteroid and its most important features. By showing experimental results obtained with Asteroid's recipes, we show that our implementations are at least on par with most results reported in reference papers. The toolkit is publicly available at https://github.com/mpariente/asteroid ., Submitted to Interspeech 2020
- Published
- 2020