Author: "Defossez A" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Defossez A"' showing total 1,650 results

1. Moshi: a speech-text foundation model for real-time dialogue

Author: Défossez, Alexandre, Mazaré, Laurent, Orsini, Manu, Royer, Amélie, Pérez, Patrick, Jégou, Hervé, Grave, Edouard, and Zeghidour, Neil
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Sound
Abstract: We introduce Moshi, a speech-text foundation model and full-duplex spoken dialogue framework. Current systems for spoken dialogue rely on pipelines of independent components, namely voice activity detection, speech recognition, textual dialogue and text-to-speech. Such frameworks cannot emulate the experience of real conversations. First, their complexity induces a latency of several seconds between interactions. Second, text being the intermediate modality for dialogue, non-linguistic information that modifies meaning -- such as emotion or non-speech sounds -- is lost in the interaction. Finally, they rely on a segmentation into speaker turns, which does not take into account overlapping speech, interruptions and interjections. Moshi solves these independent issues altogether by casting spoken dialogue as speech-to-speech generation. Starting from a text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec, while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics. We moreover extend the hierarchical semantic-to-acoustic token generation of previous work to first predict time-aligned text tokens as a prefix to audio tokens. Not only does this "Inner Monologue" method significantly improve the linguistic quality of generated speech, but we also illustrate how it can provide streaming speech recognition and text-to-speech. Our resulting model is the first real-time full-duplex spoken large language model, with a theoretical latency of 160ms, 200ms in practice, and is available at https://github.com/kyutai-labs/moshi.
Published: 2024

2. Audio Conditioning for Music Generation via Discrete Bottleneck Features

Author: Rouard, Simon, Adi, Yossi, Copet, Jade, Roebel, Axel, and Défossez, Alexandre
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: While most music generation models use textual or parametric conditioning (e.g. tempo, harmony, musical genre), we propose to condition a language model based music generation system with audio input. Our exploration involves two distinct strategies. The first strategy, termed textual inversion, leverages a pre-trained text-to-music model to map audio input to corresponding "pseudowords" in the textual embedding space. For the second model we train a music language model from scratch jointly with a text conditioner and a quantized audio feature extractor. At inference time, we can mix textual and audio conditioning and balance them thanks to a novel double classifier free guidance method. We conduct automatic and human studies that validate our approach. We will release the code and we provide music samples on https://musicgenstyle.github.io in order to show the quality of our model., Comment: 6 pages, 2 figures, accepted at ISMIR 2024
Published: 2024

3. An Independence-promoting Loss for Music Generation with Language Models

Author: Lemercier, Jean-Marie, Rouard, Simon, Copet, Jade, Adi, Yossi, and Défossez, Alexandre
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Music generation schemes using language modeling rely on a vocabulary of audio tokens, generally provided as codes in a discrete latent space learnt by an auto-encoder. Multi-stage quantizers are often employed to produce these tokens, therefore the decoding strategy used for token prediction must be adapted to account for multiple codebooks: either it should model the joint distribution over all codebooks, or fit the product of the codebook marginal distributions. Modelling the joint distribution requires a costly increase in the number of auto-regressive steps, while fitting the product of the marginals yields an inexact model unless the codebooks are mutually independent. In this work, we introduce an independence-promoting loss to regularize the auto-encoder used as the tokenizer in language models for music generation. The proposed loss is a proxy for mutual information based on the maximum mean discrepancy principle, applied in reproducing kernel Hilbert spaces. Our criterion is simple to implement and train, and it is generalizable to other multi-stream codecs. We show that it reduces the statistical dependence between codebooks during auto-encoding. This leads to an increase in the generated music quality when modelling the product of the marginal distributions, while generating audio much faster than the joint distribution model., Comment: Accepted to ICML 2024
Published: 2024

4. Cartographie de l'habitat de reproduction du tétras-lyre (Lyrurus tetrix) dans les Alpes françaises

Author: Defossez, Alexandre, Alleaume, Samuel, Montadert, Marc, Ienco, Dino, and Luque, Sandra
Subjects: Quantitative Biology - Populations and Evolution
Abstract: The Black Grouse (Lyrurus tetrix) is an emblematic alpine species with high conservation importance. The population size of this mountain bird tends to decline at the reference sites and shows differences according to changes in local landscape characteristics. Habitat changes are at the centre of the identified pressures impacting part or all of its life cycle, according to experts. Hence, one approach to monitoring population dynamics is through modelling the favourable habitats for Black Grouse breeding (nesting sites). Coupling modelling with multi-source remote sensing data (medium and very high spatial resolution) then allowed the implementation of a spatial distribution model of the species. Indeed, the extraction of variables from remote sensing helped to describe the area studied at appropriate spatial and temporal scales: horizontal and vertical structure (heterogeneity), functioning (vegetation indices), phenology (seasonal or inter-annual dynamics) and biodiversity. An annual time series of radiometric indices (NDVI, NDWI, BI, ...) from Sentinel-2 has made it possible to generate Dynamic Habitat Indices (DHIs) to derive phenological indications on the nature and dynamics of natural habitats. In addition, very high resolution images (SPOT6) provided access to the fine structure of natural habitats, i.e. the vertical and horizontal organisation by states identified as elementary (mineral, herbaceous, low and high woody). Indeed, one of the essential limiting factors for brood rearing is the presence of a well-developed herbaceous or ericaceous stratum in the northern Alps and larch forests in the southern region. A deep learning model was used to classify elementary strata. Finally, the Biomod2 R platform, using an ensemble approach, was applied to model the favourable habitat for Black Grouse reproduction. Of all the models, Random Forest and Extreme Boosted Gradient are the best performing, with TSS and ROC scores close to 1. For the SDM, we selected only Random Forest models (ensemble modelling) because of their low susceptibility to overfitting and coherent predictions (after comparing model predictions). In this ensemble model, the most important explanatory variables are altitude, the proportion of heathland, and the DHI (NDVI Max and NDWI Max). Results from the habitat model can be used as an operational tool for monitoring forest landscape shifts and changes. They can also be used to delimit potential areas for protecting the species' habitat, making them a valuable decision-making tool for conservation management of open mountain forest., Comment: in French language
Published: 2024

5. Proactive Detection of Voice Cloning with Localized Watermarking

Author: Roman, Robin San, Fernandez, Pierre, Défossez, Alexandre, Furon, Teddy, Tran, Tuan, and Elsahar, Hady
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
Abstract: In the rapidly evolving field of speech generative models, there is a pressing need to ensure audio authenticity against the risks of voice cloning. We present AudioSeal, the first audio watermarking technique designed specifically for localized detection of AI-generated speech. AudioSeal employs a generator/detector architecture trained jointly with a localization loss to enable localized watermark detection up to the sample level, and a novel perceptual loss inspired by auditory masking, that enables AudioSeal to achieve better imperceptibility. AudioSeal achieves state-of-the-art performance in terms of robustness to real life audio manipulations and imperceptibility based on automatic and human evaluation metrics. Additionally, AudioSeal is designed with a fast, single-pass detector, that significantly surpasses existing models in speed - achieving detection up to two orders of magnitude faster, making it ideal for large-scale and real-time applications., Comment: Published at ICML 2024. Code at https://github.com/facebookresearch/audioseal - webpage at https://pierrefdz.github.io/publications/audioseal/
Published: 2024

6. Masked Audio Generation using a Single Non-Autoregressive Transformer

Author: Ziv, Alon, Gat, Itai, Lan, Gael Le, Remez, Tal, Kreuk, Felix, Défossez, Alexandre, Copet, Jade, Synnaeve, Gabriel, and Adi, Yossi
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We introduce MAGNeT, a masked generative sequence modeling method that operates directly over several streams of audio tokens. Unlike prior work, MAGNeT is comprised of a single-stage, non-autoregressive transformer. During training, we predict spans of masked tokens obtained from a masking scheduler, while during inference we gradually construct the output sequence using several decoding steps. To further enhance the quality of the generated audio, we introduce a novel rescoring method in which we leverage an external pre-trained model to rescore and rank predictions from MAGNeT, which are then used for later decoding steps. Lastly, we explore a hybrid version of MAGNeT, in which we fuse autoregressive and non-autoregressive models to generate the first few seconds in an autoregressive manner while the rest of the sequence is decoded in parallel. We demonstrate the efficiency of MAGNeT for the task of text-to-music and text-to-audio generation and conduct an extensive empirical evaluation, considering both objective metrics and human studies. The proposed approach is comparable to the evaluated baselines, while being significantly faster (x7 faster than the autoregressive baseline). Through ablation studies and analysis, we shed light on the importance of each of the components comprising MAGNeT, together with pointing to the trade-offs between autoregressive and non-autoregressive modeling, considering latency, throughput, and generation quality. Samples are available on our demo page https://pages.cs.huji.ac.il/adiyoss-lab/MAGNeT.
Published: 2024

7. Non-canonical functions of UHRF1 maintain DNA methylation homeostasis in cancer cells

Author: Yamaguchi, Kosuke, Chen, Xiaoying, Rodgers, Brianna, Miura, Fumihito, Bashtrykov, Pavel, Bonhomme, Frédéric, Salinas-Luypaert, Catalina, Haxholli, Deis, Gutekunst, Nicole, Aygenli, Bihter Özdemir, Ferry, Laure, Kirsh, Olivier, Laisné, Marthe, Scelfo, Andrea, Ugur, Enes, Arimondo, Paola B., Leonhardt, Heinrich, Kanemaki, Masato T., Bartke, Till, Fachinetti, Daniele, Jeltsch, Albert, Ito, Takashi, and Defossez, Pierre-Antoine
Published: 2024

8. Constructing Chronicity and Clouding Kairos: The Fragmentation of Temporal Dialectics in Descriptions of Chronic Depression

Author: Defossez, Ellen
Published: 2024

9. Integer Programming with GCD Constraints

Author: Defossez, Rémy, Haase, Christoph, Mansutti, Alessio, and Perez, Guillermo A.
Subjects: Computer Science - Logic in Computer Science, Mathematics - Number Theory
Abstract: We study the non-linear extension of integer programming with greatest common divisor constraints of the form $\gcd(f,g) \sim d$, where $f$ and $g$ are linear polynomials, $d$ is a positive integer, and $\sim$ is a relation among $\leq, =, \neq$ and $\geq$. We show that the feasibility problem for these systems is in NP, and that an optimal solution minimizing a linear objective function, if it exists, has polynomial bit length. To show these results, we identify an expressive fragment of the existential theory of the integers with addition and divisibility that admits solutions of polynomial bit length. It was shown by Lipshitz [Trans. Am. Math. Soc., 235, pp. 271-283, 1978] that this theory adheres to a local-to-global principle in the following sense: a formula $\Phi$ is equi-satisfiable with a formula $\Psi$ in this theory such that $\Psi$ has a solution if and only if $\Psi$ has a solution modulo every prime $p$. We show that in our fragment, only a polynomial number of primes of polynomial bit length need to be considered, and that the solutions modulo prime numbers can be combined to yield a solution to $\Phi$ of polynomial bit length. As a technical by-product, we establish a Chinese-remainder-type theorem for systems of congruences and non-congruences showing that solution sizes do not depend on the magnitude of the moduli of non-congruences.
Published: 2023

10. Code Llama: Open Foundation Models for Code

Author: Rozière, Baptiste, Gehring, Jonas, Gloeckle, Fabian, Sootla, Sten, Gat, Itai, Tan, Xiaoqing Ellen, Adi, Yossi, Liu, Jingyu, Sauvestre, Romain, Remez, Tal, Rapin, Jérémy, Kozhevnikov, Artyom, Evtimov, Ivan, Bitton, Joanna, Bhatt, Manish, Ferrer, Cristian Canton, Grattafiori, Aaron, Xiong, Wenhan, Défossez, Alexandre, Copet, Jade, Azhar, Faisal, Touvron, Hugo, Martin, Louis, Usunier, Nicolas, Scialom, Thomas, and Synnaeve, Gabriel
Subjects: Computer Science - Computation and Language
Abstract: We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama - Python), and instruction-following models (Code Llama - Instruct) with 7B, 13B, 34B and 70B parameters each. All models are trained on sequences of 16k tokens and show improvements on inputs with up to 100k tokens. 7B, 13B and 70B Code Llama and Code Llama - Instruct variants support infilling based on surrounding content. Code Llama reaches state-of-the-art performance among open models on several code benchmarks, with scores of up to 67% and 65% on HumanEval and MBPP, respectively. Notably, Code Llama - Python 7B outperforms Llama 2 70B on HumanEval and MBPP, and all our models outperform every other publicly available model on MultiPL-E. We release Code Llama under a permissive license that allows for both research and commercial use.
Published: 2023

11. The Sound Demixing Challenge 2023 – Music Demixing Track

Author: Fabbro, Giorgio, Uhlich, Stefan, Lai, Chieh-Hsin, Choi, Woosung, Martínez-Ramírez, Marco, Liao, Weihsiang, Gadelha, Igor, Ramos, Geraldo, Hsu, Eddie, Rodrigues, Hugo, Stöter, Fabian-Robert, Défossez, Alexandre, Luo, Yi, Yu, Jianwei, Chakraborty, Dipam, Mohanty, Sharada, Solovyev, Roman, Stempkovskiy, Alexander, Habruseva, Tatiana, Goswami, Nabarun, Harada, Tatsuya, Kim, Minseok, Lee, Jun Hyung, Dong, Yuanliang, Zhang, Xinran, Liu, Jiafeng, and Mitsufuji, Yuki
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge (SDX'23). We provide a summary of the challenge setup and introduce the task of robust music source separation (MSS), i.e., training MSS models in the presence of errors in the training data. We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce two new datasets that simulate such errors: SDXDB23_LabelNoise and SDXDB23_Bleeding. We describe the methods that achieved the highest scores in the competition. Moreover, we present a direct comparison with the previous edition of the challenge (the Music Demixing Challenge 2021): the best performing system achieved an improvement of over 1.6dB in signal-to-distortion ratio over the winner of the previous competition, when evaluated on MDXDB21. Besides relying on the signal-to-distortion ratio as objective metric, we also performed a listening test with renowned producers and musicians to study the perceptual quality of the systems and report here the results. Finally, we provide our insights into the organization of the competition and our prospects for future editions., Comment: Published in Transactions of the International Society for Music Information Retrieval (https://transactions.ismir.net/articles/10.5334/tismir.171)
Published: 2023

12. From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion

Author: Roman, Robin San, Adi, Yossi, Deleforge, Antoine, Serizel, Romain, Synnaeve, Gabriel, and Défossez, Alexandre
Subjects: Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Deep generative models can generate high-fidelity audio conditioned on various types of representations (e.g., mel-spectrograms, Mel-frequency Cepstral Coefficients (MFCC)). Recently, such models have been used to synthesize audio waveforms conditioned on highly compressed representations. Although such methods produce impressive results, they are prone to generate audible artifacts when the conditioning is flawed or imperfect. An alternative modeling approach is to use diffusion models. However, these have mainly been used as speech vocoders (i.e., conditioned on mel-spectrograms) or for generating relatively low sampling rate signals. In this work, we propose a high-fidelity multi-band diffusion-based framework that generates any type of audio modality (e.g., speech, music, environmental sounds) from low-bitrate discrete representations. At equal bit rate, the proposed approach outperforms state-of-the-art generative techniques in terms of perceptual quality. Training and evaluation code, along with audio samples, are available on the facebookresearch/audiocraft GitHub page., Comment: 10 pages
Published: 2023

13. A novel bioinformatic approach reveals cooperation between Cancer/Testis genes in basal-like breast tumors

Author: Laisné, Marthe, Rodgers, Brianna, Benlamara, Sarah, Wicinski, Julien, Nicolas, André, Djerroudi, Lounes, Gupta, Nikhil, Ferry, Laure, Kirsh, Olivier, Daher, Diana, Philippe, Claude, Okada, Yuki, Charafe-Jauffret, Emmanuelle, Cristofari, Gael, Meseure, Didier, Vincent-Salomon, Anne, Ginestier, Christophe, and Defossez, Pierre-Antoine
Published: 2024

14. Simple and Controllable Music Generation

Author: Copet, Jade, Kreuk, Felix, Gat, Itai, Remez, Tal, Kant, David, Synnaeve, Gabriel, Adi, Yossi, and Défossez, Alexandre
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We tackle the task of conditional music generation. We introduce MusicGen, a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens. Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns, which eliminates the need for cascading several models, e.g., hierarchically or upsampling. Following this approach, we demonstrate how MusicGen can generate high-quality samples, both mono and stereo, while being conditioned on textual description or melodic features, allowing better control over the generated output. We conduct extensive empirical evaluation, considering both automatic and human studies, showing the proposed approach is superior to the evaluated baselines on a standard text-to-music benchmark. Through ablation studies, we shed light on the importance of each of the components comprising MusicGen. Music samples, code, and models are available at https://github.com/facebookresearch/audiocraft, Comment: Published at NeurIPS 2023
Published: 2023

15. Textually Pretrained Speech Language Models

Author: Hassid, Michael, Remez, Tal, Nguyen, Tu Anh, Gat, Itai, Conneau, Alexis, Kreuk, Felix, Copet, Jade, Defossez, Alexandre, Synnaeve, Gabriel, Dupoux, Emmanuel, Schwartz, Roy, and Adi, Yossi
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Speech language models (SpeechLMs) process and generate acoustic data only, without textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using a warm-start from a pretrained textual language model. We show using both automatic and human evaluations that TWIST outperforms a cold-start SpeechLM across the board. We empirically analyze the effect of different model design choices such as the speech tokenizer, the pretrained textual model, and the dataset size. We find that model and dataset scale both play an important role in constructing better-performing SpeechLMs. Based on our observations, we present the largest (to the best of our knowledge) SpeechLM both in terms of number of parameters and training data. We additionally introduce two spoken versions of the StoryCloze textual benchmark to further improve model evaluation and advance future research in the field. We make speech samples, code and models publicly available: https://pages.cs.huji.ac.il/adiyoss-lab/twist/ ., Comment: NeurIPS 2023
Published: 2023

16. Hybrid Transformers for Music Source Separation

Author: Rouard, Simon, Massa, Francisco, and Défossez, Alexandre
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: A natural question arising in Music Source Separation (MSS) is whether long range contextual information is useful, or whether local acoustic features are sufficient. In other fields, attention based Transformers have shown their ability to integrate information over long sequences. In this work, we introduce Hybrid Transformer Demucs (HT Demucs), a hybrid temporal/spectral bi-U-Net based on Hybrid Demucs, where the innermost layers are replaced by a cross-domain Transformer Encoder, using self-attention within one domain, and cross-attention across domains. While it performs poorly when trained only on MUSDB, we show that it outperforms Hybrid Demucs (trained on the same data) by 0.45 dB of SDR when using 800 extra training songs. Using sparse attention kernels to extend its receptive field, and per source fine-tuning, we achieve state-of-the-art results on MUSDB with extra training data, with 9.20 dB of SDR.
Published: 2022

17. Audio Language Modeling using Perceptually-Guided Discrete Representations

Author: Kreuk, Felix, Taigman, Yaniv, Polyak, Adam, Copet, Jade, Synnaeve, Gabriel, Défossez, Alexandre, and Adi, Yossi
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In this work, we study the task of Audio Language Modeling, in which we aim at learning probabilistic models for audio that can be used for generation and completion. We use a state-of-the-art perceptually-guided audio compression model, to encode audio to discrete representations. Next, we train a transformer-based causal language model using these representations. At inference time, we perform audio auto-completion by encoding an audio prompt as a discrete sequence, feeding it to the audio language model, sampling from the model, and synthesizing the corresponding time-domain signal. We evaluate the quality of samples generated by our method on Audioset, the largest dataset for general audio to date, and show that it is superior to the evaluated baseline audio encoders. We additionally provide an extensive analysis to better understand the trade-off between audio-quality and language-modeling capabilities. Samples:link.
Published: 2022

18. High Fidelity Neural Audio Compression

Author: Défossez, Alexandre, Copet, Jade, Synnaeve, Gabriel, and Adi, Yossi
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence, Computer Science - Sound, Statistics - Machine Learning
Abstract: We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural networks. It consists of a streaming encoder-decoder architecture with quantized latent space trained in an end-to-end fashion. We simplify and speed up the training by using a single multiscale spectrogram adversary that efficiently reduces artifacts and produces high-quality samples. We introduce a novel loss balancer mechanism to stabilize training: the weight of a loss now defines the fraction of the overall gradient it should represent, thus decoupling the choice of this hyper-parameter from the typical scale of the loss. Finally, we study how lightweight Transformer models can be used to further compress the obtained representation by up to 40%, while staying faster than real time. We provide a detailed description of the key design choices of the proposed model including: training objective, architectural changes and a study of various perceptual loss functions. We present an extensive subjective evaluation (MUSHRA tests) together with an ablation study for a range of bandwidths and audio domains, including speech, noisy-reverberant speech, and music. Our approach is superior to the baseline methods across all evaluated settings, considering both 24 kHz monophonic and 48 kHz stereophonic audio. Code and models are available at github.com/facebookresearch/encodec., Comment: Preprint
Published: 2022

19. Non-canonical functions of UHRF1 maintain DNA methylation homeostasis in cancer cells

Author: Kosuke Yamaguchi, Xiaoying Chen, Brianna Rodgers, Fumihito Miura, Pavel Bashtrykov, Frédéric Bonhomme, Catalina Salinas-Luypaert, Deis Haxholli, Nicole Gutekunst, Bihter Özdemir Aygenli, Laure Ferry, Olivier Kirsh, Marthe Laisné, Andrea Scelfo, Enes Ugur, Paola B. Arimondo, Heinrich Leonhardt, Masato T. Kanemaki, Till Bartke, Daniele Fachinetti, Albert Jeltsch, Takashi Ito, and Pierre-Antoine Defossez
Subjects: Science
Abstract: DNA methylation is an essential epigenetic chromatin modification, and its maintenance in mammals requires the protein UHRF1. It is yet unclear if UHRF1 functions solely by stimulating DNA methylation maintenance by DNMT1, or if it has important additional functions. Using degron alleles, we show that UHRF1 depletion causes a much greater loss of DNA methylation than DNMT1 depletion. This is not caused by passive demethylation as UHRF1-depleted cells proliferate more slowly than DNMT1-depleted cells. Instead, bioinformatics, proteomics and genetics experiments establish that UHRF1, besides activating DNMT1, interacts with DNMT3A and DNMT3B and promotes their activity. In addition, we show that UHRF1 antagonizes active DNA demethylation by TET2. Therefore, UHRF1 has non-canonical roles that contribute importantly to DNA methylation homeostasis; these findings have practical implications for epigenetics in health and disease.
Published: 2024

20. AudioGen: Textually Guided Audio Generation

Author: Kreuk, Felix, Synnaeve, Gabriel, Polyak, Adam, Singer, Uriel, Défossez, Alexandre, Copet, Jade, Parikh, Devi, Taigman, Yaniv, and Adi, Yossi
Subjects: Computer Science - Sound, Computer Science - Computation and Language, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We tackle the problem of generating audio samples conditioned on descriptive text captions. In this work, we propose AudioGen, an auto-regressive generative model that generates audio samples conditioned on text inputs. AudioGen operates on a learnt discrete audio representation. The task of text-to-audio generation poses multiple challenges. Due to the way audio travels through a medium, differentiating "objects" can be a difficult task (e.g., separating multiple people simultaneously speaking). This is further complicated by real-world recording conditions (e.g., background noise, reverberation, etc.). Scarce text annotations impose another constraint, limiting the ability to scale models. Finally, modeling high-fidelity audio requires encoding audio at high sampling rate, leading to extremely long sequences. To alleviate the aforementioned challenges we propose an augmentation technique that mixes different audio samples, driving the model to internally learn to separate multiple sources. We curated 10 datasets containing different types of audio and text annotations to handle the scarcity of text-audio data points. For faster inference, we explore the use of multi-stream modeling, allowing the use of shorter sequences while maintaining a similar bitrate and perceptual quality. We apply classifier-free guidance to improve adherence to text. Compared to the evaluated baselines, AudioGen performs better on both objective and subjective metrics. Finally, we explore the ability of the proposed method to generate audio continuation conditionally and unconditionally. Samples: https://felixkreuk.github.io/audiogen, Comment: Accepted to ICLR 2023
Published: 2022

21. Decoding speech perception from non-invasive brain recordings

Author: Défossez, Alexandre, Caucheteux, Charlotte, Rapin, Jérémy, Kabeli, Ori, and King, Jean-Rémi
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Quantitative Biology - Neurons and Cognition
Abstract: Decoding speech from brain activity is a long-awaited goal in both healthcare and neuroscience. Invasive devices have recently led to major milestones in that regard: deep learning algorithms trained on intracranial recordings now start to decode elementary linguistic features (e.g. letters, words, spectrograms). However, extending this approach to natural speech and non-invasive brain recordings remains a major challenge. Here, we introduce a model trained with contrastive-learning to decode self-supervised representations of perceived speech from the non-invasive recordings of a large cohort of healthy individuals. To evaluate this approach, we curate and integrate four public datasets, encompassing 175 volunteers recorded with magneto- or electro-encephalography (M/EEG), while they listened to short stories and isolated sentences. The results show that our model can identify, from 3 seconds of MEG signals, the corresponding speech segment with up to 41% accuracy out of more than 1,000 distinct possibilities on average across participants, and more than 80% in the very best participants - a performance that allows the decoding of words and phrases absent from the training set. The comparison of our model to a variety of baselines highlights the importance of (i) a contrastive objective, (ii) pretrained representations of speech and (iii) a common convolutional architecture simultaneously trained across multiple participants. Finally, the analysis of the decoder's predictions suggests that they primarily depend on lexical and contextual semantic representations. Overall, this effective decoding of perceived speech from non-invasive recordings delineates a promising path to decode language from brain activity, without putting patients at risk for brain surgery., Comment: updated version following publication in Nature Machine Intelligence (2023)
Published: 2022
Full Text: View/download PDF

22. Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain

Author: Markovic, Dejan, Defossez, Alexandre, and Richard, Alexander
Subjects: Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We present a single-stage causal waveform-to-waveform multichannel model that can separate moving sound sources based on their broad spatial locations in a dynamic acoustic scene. We divide the scene into two spatial regions containing, respectively, the target and the interfering sound sources. The model is trained end-to-end and performs spatial processing implicitly, without any components based on traditional processing or use of hand-crafted spatial features. We evaluate the proposed model on a real-world dataset and show that the model matches the performance of an oracle beamformer followed by a state-of-the-art single-channel enhancement network., Comment: Interspeech 2022
Published: 2022

23. Variation in insect herbivory across an urbanization gradient: The role of abiotic factors and leaf secondary metabolites

Author: Moreira, Xoaquín, Van den Bossche, Astrid, Moeys, Karlien, Van Meerbeek, Koenraad, Thomaes, Arno, Vázquez-González, Carla, Abdala-Roberts, Luis, Brunet, Jörg, Cousins, Sara A.O., Defossez, Emmanuel, De Pauw, Karen, Diekmann, Martin, Glauser, Gaétan, Graae, Bente J., Hagenblad, Jenny, Heavyside, Paige, Hedwall, Per-Ola, Heinken, Thilo, Huang, Siyu, Lago-Núñez, Beatriz, Lenoir, Jonathan, Lindgren, Jessica, Lindmo, Sigrid, Mazalla, Leonie, Naaf, Tobias, Orczewska, Anna, Paulssen, Jolina, Plue, Jan, Rasmann, Sergio, Spicher, Fabien, Vanneste, Thomas, Verschuren, Louis, Visakorpi, Kristiina, Wulf, Monika, and De Frenne, Pieter
Published: 2024
Full Text: View/download PDF

24. Hybrid Spectrogram and Waveform Source Separation

Author: Défossez, Alexandre
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound, Statistics - Machine Learning
Abstract: Source separation models either work on the spectrogram or waveform domain. In this work, we show how to perform end-to-end hybrid source separation, letting the model decide which domain is best suited for each source, and even combining both. The proposed hybrid version of the Demucs architecture won the Music Demixing Challenge 2021 organized by Sony. This architecture also comes with additional improvements, such as compressed residual branches, local attention or singular value regularization. Overall, a 1.4 dB improvement of the Signal-To-Distortion (SDR) was observed across all sources as measured on the MusDB HQ dataset, an improvement confirmed by human subjective evaluation, with an overall quality rated at 2.83 out of 5 (2.36 for the non hybrid Demucs), and absence of contamination at 3.04 (against 2.37 for the non hybrid Demucs and 2.44 for the second ranking model submitted at the competition)., Comment: ISMIR 2021 MDX Workshop, 11 pages, 2 figures
Published: 2021

25. Multi-scale datasets for monitoring Mediterranean oak forests from optical remote sensing during the SENTHYMED/MEDOAK experiment in the north of Montpellier (France)

Author: K. Adeline, J.B. Féret, H. Clenet, J.M. Limousin, J.M. Ourcival, F. Mouillot, S. Alleaume, A. Jolivot, X. Briottet, L. Bidel, E. Aria, ATM. Defossez, T. Gaubert, J. Giffard-Carlet, J. Kempf, D. Longepierre, F. Lopez, T. Miraglio, J. Vigouroux, and M. Debue
Subjects: Oak forests, Species inventory, Canopy plant area index, Leaf traits, Optical properties, UAV-borne LiDAR data, Computer applications to medicine. Medical informatics, R858-859.7, Science (General), Q1-390
Abstract: Mediterranean forests represent critical areas that are increasingly affected by the frequency of droughts and fires, anthropic activities and land use changes. Optical remote sensing data give access to several essential biodiversity variables, such as species traits (related to vegetation biophysical and biochemical composition), which can help to better understand the structure and functioning of these forests. However, their reliability highly depends on the scale of observation and the spectral configuration of the sensor. Thus, the objective of the SENTHYMED/MEDOAK experiment is to provide datasets from leaf to canopy scale in synchronization with remote sensing acquisitions obtained from multi-platform sensors having different spectral characteristics and spatial resolutions. Seven monthly data collections were performed between April and October 2021 (with a complementary one in June 2023) over two forests in the north of Montpellier, France, comprised of two oak endemic species with different phenological dynamics (evergreen: Quercus ilex and deciduous: Quercus pubescens) and a variability of canopy cover fractions (from dense to open canopy). These collections were coincident with satellite multispectral Sentinel-2 data and one with airborne hyperspectral AVIRIS-Next Generation data. In addition, satellite hyperspectral PRISMA and DESIS were also available for some dates. All these airborne and satellite data are provided from free online download websites. Eight datasets are presented in this paper from thirteen studied forest plots: (1) overstory and understory inventory, (2) 687 canopy plant area index from Li-COR plant canopy analyzers, (3) 1475 in situ spectral reflectances (oak canopy, trunk, grass, limestone, etc.) from ASD spectroradiometers, (4) 92 soil moistures and temperatures from IMKO and Campbell probes, (5) 747 leaf-clip optical data from SPAD and DUALEX sensors, (6) 2594 in-lab leaf directional-hemispherical reflectances and transmittances from ASD spectroradiometer coupled with an integrating sphere, (7) 747 in-lab measured leaf water and dry matter content, and additional leaf traits by inversion of the PROSPECT model and (8) UAV-borne LiDAR 3-D point clouds. These datasets can be useful for multi-scale and multi-temporal calibration/validation of high level satellite vegetation products such as species traits, for current and future imaging spectroscopic missions, and by fusing or comparing both multispectral and hyperspectral data. Other targeted applications can be forest 3-D modelling, biodiversity assessment, fire risk prevention and globally vegetation monitoring.
Published: 2024
Full Text: View/download PDF

26. Music Demixing Challenge 2021

Author: Mitsufuji, Yuki, Fabbro, Giorgio, Uhlich, Stefan, Stöter, Fabian-Robert, Défossez, Alexandre, Kim, Minseok, Choi, Woosung, Yu, Chin-Yun, and Cheuk, Kin-Wai
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Music source separation has been intensively studied in the last decade and tremendous progress with the advent of deep learning could be observed. Evaluation campaigns such as MIREX or SiSEC connected state-of-the-art models and corresponding papers, which can help researchers integrate the best practices into their models. In recent years, the widely used MUSDB18 dataset played an important role in measuring the performance of music source separation. While the dataset made a considerable contribution to the advancement of the field, it is also subject to several biases resulting from a focus on Western pop music and a limited number of mixing engineers being involved. To address these issues, we designed the Music Demixing (MDX) Challenge on a crowd-based machine learning competition platform where the task is to separate stereo songs into four instrument stems (Vocals, Drums, Bass, Other). The main differences compared with the past challenges are 1) the competition is designed to more easily allow machine learning practitioners from other disciplines to participate, 2) evaluation is done on a hidden test set created by music professionals dedicated exclusively to the challenge to assure the transparency of the challenge, i.e., the test set is not accessible to anyone except the challenge organizers, and 3) the dataset provides a wider range of music genres and involved a greater number of mixing engineers. In this paper, we provide the details of the datasets, baselines, evaluation metrics, evaluation results, and technical challenges for future competitions.
Published: 2021
Full Text: View/download PDF

27. Microfluidic-based production of [68Ga]Ga-FAPI-46 and [68Ga]Ga-DOTA-TOC using the cassette-based iMiDEV™ microfluidic radiosynthesizer

Author: Hemantha Mallapura, Olga Ovdiichuk, Emma Jussing, Tran A. Thuy, Camille Piatkowski, Laurent Tanguy, Charlotte Collet-Defossez, Bengt Långström, Christer Halldin, and Sangram Nag
Subjects: Positron emission tomography (PET), Radiopharmaceuticals, Microfluidics, iMiDEV, [68Ga]Ga-FAPI-46, [68Ga]Ga-DOTA-TOC, Medical physics. Medical radiology. Nuclear medicine, R895-920, Therapeutics. Pharmacology, RM1-950
Abstract: Background: The demand for 68Ga-labeled radiotracers has significantly increased in the past decade, driven by the development of diversified imaging tracers, such as FAPI derivatives, PSMA-11, DOTA-TOC, and DOTA-TATE. These tracers have exhibited promising results in theranostic applications, fueling interest in exploring them for clinical use. Among these probes, 68Ga-labeled FAPI-46 and DOTA-TOC have emerged as key players due to their ability to diagnose a broad spectrum of cancers ([68Ga]Ga-FAPI-46) in late-phase studies, whereas [68Ga]Ga-DOTA-TOC is clinically approved for neuroendocrine tumors. To facilitate their production, we leveraged a microfluidic cassette-based iMiDEV radiosynthesizer, enabling the synthesis of [68Ga]Ga-FAPI-46 and [68Ga]Ga-DOTA-TOC based on a dose-on-demand (DOD) approach. Results: Different mixing techniques were explored to influence radiochemical yield. We achieved decay-corrected yield of 44 ± 5% for [68Ga]Ga-FAPI-46 and 46 ± 7% for [68Ga]Ga-DOTA-TOC in approximately 30 min. The radiochemical purities (HPLC) of [68Ga]Ga-FAPI-46 and [68Ga]Ga-DOTA-TOC were 98.2 ± 0.2% and 98.4 ± 0.9%, respectively. All the quality control results complied with European Pharmacopoeia quality standards. We optimized various parameters, including 68Ga trapping and elution, cassette batches, passive mixing in the reactor, and solid-phase extraction (SPE) purification and formulation. The developed synthesis method reduced the amount of precursor and other chemicals required for synthesis compared to conventional radiosynthesizers. Conclusions: The microfluidic-based approach enabled the implementation of radiosynthesis of [68Ga]Ga-FAPI-46 and [68Ga]Ga-DOTA-TOC on the iMiDEV™ microfluidic module, paving the way for their use in preclinical and clinical applications. The microfluidic synthesis approach utilized 2–3 times less precursor than cassette-based conventional synthesis. The synthesis method was also successfully validated in a similar microfluidic iMiDEV module at a different research center for the synthesis of [68Ga]Ga-FAPI-46 with limited runs. Our study demonstrated the potential of microfluidic methods for efficient and reliable radiometal-based radiopharmaceutical synthesis, contributing valuable insights for future advancements in this field and paving the way for routine clinical applications in the near future.
Published: 2023
Full Text: View/download PDF

28. RSK3 switches cell fate: from stress-induced senescence to malignant progression

Author: Anda Huna, Jean-Michel Flaman, Catalina Lodillinsky, Kexin Zhu, Gabriela Makulyte, Victoria Pakulska, Yohann Coute, Clémence Ruisseaux, Pierre Saintigny, Hector Hernandez-Vargas, Pierre-Antoine Defossez, Mathieu Boissan, Nadine Martin, and David Bernard
Subjects: Cellular senescence, Epithelial-mesenchymal transition, TGFβ, Breast tumor, Neoplasms. Tumors. Oncology. Including cancer and carcinogens, RC254-282
Abstract: Background: TGFβ induces several cell phenotypes including senescence, a stable cell cycle arrest accompanied by a secretory program, and epithelial-mesenchymal transition (EMT) in normal epithelial cells. During carcinogenesis cells lose the ability to undergo senescence in response to TGFβ but they maintain an EMT, which can contribute to tumor progression. Our aim was to identify mechanisms promoting TGFβ-induced senescence escape. Methods: In vitro experiments were performed with primary human mammary epithelial cells (HMEC) immortalized by hTert. For kinase library screen and modulation of gene expression retroviral transduction was used. To characterize gene expression, RNA microarray with GSEA analysis and RT-qPCR were used. For protein level and localization, Western blot and immunofluorescence were performed. For senescence characterization crystal violet assay, Senescence Associated-β-Galactosidase activity, EdU staining were conducted. To determine RSK3 partners FLAG-baited immunoprecipitation and mass spectrometry-based proteomic analyses were performed. Proteasome activity and proteasome enrichment assays were performed. To validate the role of RSK3 in human breast cancer, analysis of METABRIC database was performed. Murine intraductal xenografts using MCF10DCIS.com cells were carried out, with histological and immunofluorescence analysis of mouse tissue sections. Results: A screen with active kinases in HMECs upon TGFβ treatment identified that the serine threonine kinase RSK3, or RPS6KA2, a kinase mainly known to regulate cancer cell death including in breast cancer, reverted TGFβ-induced senescence. Interestingly, RSK3 expression decreased in response to TGFβ in a SMAD3-dependent manner, and its constitutive expression rescued SMAD3-induced senescence, indicating that a decrease in RSK3 itself contributes to TGFβ-induced senescence. Using transcriptomic analyses and affinity purification coupled to mass spectrometry-based proteomics, we unveiled that RSK3 regulates senescence by inhibiting the NF-κB pathway through the decrease in proteasome-mediated IκBα degradation. Strikingly, senescent TGFβ-treated HMECs display features of epithelial to mesenchymal transition (EMT) and during RSK3-induced senescence escaped HMECs conserve EMT features. Importantly, RSK3 expression is correlated with EMT and invasion, and inversely correlated with senescence and NF-κB in human claudin-low breast tumors and its expression enhances the formation of breast invasive tumors in the mouse mammary gland. Conclusions: We conclude that RSK3 switches cell fate from senescence to malignancy in response to TGFβ signaling.
Published: 2023
Full Text: View/download PDF

29. A genome-wide screen reveals new regulators of the 2-cell-like cell state

Author: Gupta, Nikhil, Yakhou, Lounis, Albert, Julien Richard, Azogui, Anaelle, Ferry, Laure, Kirsh, Olivier, Miura, Fumihito, Battault, Sarah, Yamaguchi, Kosuke, Laisné, Marthe, Domrane, Cécilia, Bonhomme, Frédéric, Sarkar, Arpita, Delagrange, Marine, Ducos, Bertrand, Cristofari, Gael, Ito, Takashi, Greenberg, Maxim V. C., and Defossez, Pierre-Antoine
Published: 2023
Full Text: View/download PDF

30. Long-term survival for lymphoid neoplasms and national health expenditure (EUROCARE-6): a retrospective, population-based study

Author: Hackl, Monika, Van Eycken, Elizabeth, Van Damme, Nancy, Valerianova, Zdravka, Sekerija, Mario, Scoutellas, Vasos, Demetriou, Anna, Dušek, Ladislav, Krejici, Denisa, Storm, Hans, Mägi, Margit, Innos, Kaire, Pitkäniemi, Janne, Velten, Michel, Troussard, Xavier, Bouvier, Anne-Marie, Jooste, Valerie, Guizard, Anne-Valérie, Launoy, Guy, Dabakuyo Yonli, Sandrine, Maynadié, Marc, Woronoff, Anne-Sophie, Nousbaum, Jean-Baptiste, Coureau, Gaëlle, Monnereau, Alain, Baldi, Isabelle, Hammas, Karima, Tretarre, Brigitte, Colonna, Marc, Plouvier, Sandrine, D'Almeida, Tania, Molinié, Florence, Cowppli-Bony, Anne, Bara, Simona, Debreuve, Adeline, Defossez, Gautier, Lapôtre-Ledoux, Bénédicte, Grosclaude, Pascale, Daubisse-Marliac, Laetitia, Luttmann, Sabine, Eberle, Andrea, Stabenow, Roland, Nennecke, Alice, Kieschke, Joachim, Zeissig, Sylke, Holleczek, Bernd, Katalinic, Alexander, Birgisson, Helgi, Murray, Deirdre, Walsh, Paul M, Mazzoleni, Guido, Vittadello, Fabio, Cuccaro, Francesco, Galasso, Rocco, Sampietro, Giuseppe, Rosso, Stefano, Gasparotti, Cinzia, Maifredi, Giovanni, Ferrante, Margherita, Ragusa, Rosalia, Sutera Sardo, Antonella, Gambino, Maria Letizia, Lanzoni, Monica, Ballotari, Paola, Giacomazzi, Erica, Ferretti, Stefano, Caldarella, Adele, Manneschi, Gianfranco, Gatta, Gemma, Sant, Milena, Baili, Paolo, Berrino, Franco, Botta, Laura, Trama, Annalisa, Lillini, Roberto, Bernasconi, Alice, Bonfarnuzzo, Simone, Vener, Claudia, Didonè, Fabio, Lasalvia, Paolo, Buratti, Lucia, Tagliabue, Giovanna, Serraino, Diego, Dal Maso, Luigino, Capocaccia, Riccardo, De Angelis, Roberta, Demuru, Elena, Cerza, Francesco, Di Mari, Fabrizio, Di Benedetto, Corrado, Rossi, Silvia, Santaquilani, Mariano, Venanzi, Serenella, Tallon, Marco, Boni, Luca, Iacovacci, Silvia, Gennaro, Valerio, Russo, Antonio Giampiero, Gervasi, Federico, Spagnoli, Gianbattista, Cavalieri d'Oro, Luca, Fusco, Mario, Vitale, Maria Francesca, Usala, Mario, Mazzucco, Walter, Michiara, Maria, Chiranda, Giorgio, 
Cascone, Giuseppe, Rollo, Concetta Patrizia, Mangone, Lucia, Falcini, Fabio, Cavallo, Rossella, Piras, Daniela, Madeddu, Anselmo, Bella, Francesca, Fanetti, Anna Clara, Minerba, Sante, Candela, Giuseppina, Scuderi, Tiziana, Rizzello, Roberto Vito, Stracci, Fabrizio, Rugge, Massimo, Brustolin, Angelita, Pildava, Santa, Smailyte, Giedre, Azzopardi, Miriam, Johannesen, Tom Børge, Didkowska, Joanna, Wojciechowska, Urszula, Bielska-Lasota, Magdalena, Pais, Ana, Bento, Maria José, Ferreira, Ana Maia, Lourenço, António, Safaei Diba, Chakameh, Zadnik, Vesna, Zagar, Tina, Sánchez-Contador Escudero, Carmen, Franch Sureda, Paula, Lopez de Munain, Arantza, De-La-Cruz, Marta, Rojas, María Dolores, Aleman, Araceli, Vizcaino, Ana, Marcos-Gragera, Rafael, Sanvisens, Arantza, Sanchez, Maria Josè, Chirlaque Lopez, Maria Dolores, Sanchez-Gil, Antonia, Guevara, Marcela, Ardanaz, Eva, Galceran, Jaume, Carulla, Maria, Bergeron, Yvan, Bouchardy, Christine, Mohsen Mousavi, Seyed, Went, Philip, Blum, Marcel, Bordoni, Andrea, Visser, Otto, Stevens, Sarah, Broggio, John, Bennett, Damien, Gavin, Anna, Morrison, David, Huws, Dyfed Wyn, Paapsi, Keiu, Mousavi, Seyed Mohsen, and Sánchez, Maria-Jose
Published: 2024
Full Text: View/download PDF

31. Net survival in colon and rectal cancer by stage according to neoadjuvant treatment. A French population-based study

Author: Jooste, Valérie, Grosclaude, Pascale, Defossez, Gautier, Daubisse, Laetitia, Woronoff, Anne-Sophie, Bouvier, Véronique, Chirpaz, Emmanuel, Tretarre, Brigitte, Lapotre, Bénédicte, Plouvier, Sandrine, Launoy, Guy, Bonneault, Mélanie, Molinié, Florence, and Bouvier, Anne-Marie
Published: 2024
Full Text: View/download PDF

32. Multimodal PET/PAI/FLI imaging probe based on meso-O-alkyl heptamethine cyanine: Synthesis, [18F]F-radiolabelling and photophysical characterizations

Author: Jouad, Kamal, Mengel, Emilien, Selmeczi, Katalin, Bouché, Mathilde, Collet-Defossez, Charlotte, Pellegrini Moïse, Nadia, and Lamandé-Langle, Sandrine
Published: 2024
Full Text: View/download PDF

33. Socio-demographic inequalities in stage at diagnosis of lung cancer: A French population-based study

Author: Quillet, Alexandre, Le Stang, Nolwenn, Meriau, Nicolas, Isambert, Nicolas, and Defossez, Gautier
Published: 2024
Full Text: View/download PDF

34. Differentiable Model Compression via Pseudo Quantization Noise

Author: Défossez, Alexandre, Adi, Yossi, and Synnaeve, Gabriel
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We propose DiffQ, a differentiable method for model compression for quantizing model parameters without gradient approximations (e.g., Straight Through Estimator). We suggest adding independent pseudo quantization noise to model parameters during training to approximate the effect of a quantization operator. DiffQ is differentiable both with respect to the unquantized weights and the number of bits used. Given a single hyper-parameter balancing between the quantized model size and accuracy, DiffQ optimizes the number of bits used per individual weight or groups of weights, in end-to-end training. We experimentally verify that our method is competitive with STE based quantization techniques on several benchmarks and architectures for image classification, language modeling, and audio source separation. For instance, on the ImageNet dataset, DiffQ compresses a 12-layer transformer-based model by more than a factor of 8 (lower than 4 bits precision per weight on average), with a loss of 0.3% in model accuracy. Code is available at github.com/facebookresearch/diffq., Comment: final TMLR version
Published: 2021

35. Deep Recurrent Encoder: A scalable end-to-end network to model brain signals

Author: Chehab, Omar, Defossez, Alexandre, Loiseau, Jean-Christophe, Gramfort, Alexandre, and King, Jean-Remi
Subjects: Quantitative Biology - Neurons and Cognition, Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing
Abstract: Understanding how the brain responds to sensory inputs is challenging: brain recordings are partial, noisy, and high dimensional; they vary across sessions and subjects and they capture highly nonlinear dynamics. These challenges have led the community to develop a variety of preprocessing and analytical (almost exclusively linear) methods, each designed to tackle one of these issues. Instead, we propose to address these challenges through a specific end-to-end deep learning architecture, trained to predict the brain responses of multiple subjects at once. We successfully test this approach on a large cohort of magnetoencephalography (MEG) recordings acquired during a one-hour reading task. Our Deep Recurrent Encoding (DRE) architecture reliably predicts MEG responses to words with a three-fold improvement over classic linear methods. To overcome the notorious issue of interpretability of deep learning, we describe a simple variable importance analysis. When applied to DRE, this method recovers the expected evoked responses to word length and word frequency. The quantitative improvement of the present deep learning approach paves the way to better understand the nonlinear dynamics of brain activity from large datasets.
Published: 2021

36. Complete cancer prevalence in Europe in 2020 by disease duration and country (EUROCARE-6): a population-based study

Author: Hackl, Monika, Van Eycken, Elizabeth, Van Damme, Nancy, Valerianova, Zdravka, Sekerija, Mario, Scoutellas, Vasos, Demetriou, Anna, Dušek, Ladislav, Krejici, Denisa, Storm, Hans, Mägi, Margit, Innos, Kaire, Pitkäniemi, Janne, Velten, Michel, Troussard, Xavier, Bouvier, Anne-Marie, Jooste, Valerie, Guizard, Anne-Valérie, Launoy, Guy, Dabakuyo Yonli, Sandrine, Maynadié, Marc, Woronoff, Anne-Sophie, Nousbaum, Jean-Baptiste, Coureau, Gaëlle, Monnereau, Alain, Baldi, Isabelle, Hammas, Karima, Tretarre, Brigitte, Colonna, Marc, Plouvier, Sandrine, D'Almeida, Tania, Molinié, Florence, Cowppli-Bony, Anne, Bara, Simona, Debreuve, Adeline, Defossez, Gautier, Lapôtre-Ledoux, Bénédicte, Grosclaude, Pascale, Daubisse-Marliac, Laetitia, Luttmann, Sabine, Stabenow, Roland, Nennecke, Alice, Kieschke, Joachim, Zeissig, Sylke, Holleczek, Bernd, Katalinic, Alexander, Birgisson, Helgi, Murray, Deirdre, Walsh, Paul M., Mazzoleni, Guido, Vittadello, Fabio, Cuccaro, Francesco, Galasso, Rocco, Sampietro, Giuseppe, Rosso, Stefano, Gasparotti, Cinzia, Maifredi, Giovanni, Ferrante, Margherita, Ragusa, Rosalia, Sutera Sardo, Antonella, Gambino, Maria Letizia, Lanzoni, Monica, Ballotari, Paola, Giacomazzi, Erica, Ferretti, Stefano, Caldarella, Adele, Manneschi, Gianfranco, Gatta, Gemma, Sant, Milena, Baili, Paolo, Berrino, Franco, Botta, Laura, Trama, Annalisa, Lillini, Roberto, Bernasconi, Alice, Bonfarnuzzo, Simone, Vener, Claudia, Didonè, Fabio, Lasalvia, Paolo, Buratti, Lucia, Tagliabue, Giovanna, Serraino, Diego, Dal Maso, Luigino, Capocaccia, Riccardo, De Angelis, Roberta, Demuru, Elena, Di Benedetto, Corrado, Rossi, Silvia, Santaquilani, Mariano, Venanzi, Serenella, Tallon, Marco, Boni, Luca, Iacovacci, Silvia, Gennaro, Valerio, Russo, Antonio Giampiero, Gervasi, Federico, Spagnoli, Gianbattista, Cavalieri d'Oro, Luca, Fusco, Mario, Vitale, Maria Francesca, Usala, Mario, Mazzucco, Walter, Michiara, Maria, Chiranda, Giorgio, Cascone, Giuseppe, Giurdanella, Maria Concetta, Mangone, 
Lucia, Falcini, Fabio, Cavallo, Rossella, Piras, Daniela, Madeddu, Anselmo, Bella, Francesca, Fanetti, Anna Clara, Minerba, Sante, Candela, Giuseppina, Scuderi, Tiziana, Rizzello, Roberto Vito, Stracci, Fabrizio, Rugge, Massimo, Brustolin, Angelita, Pildava, Santa, Smailyte, Giedre, Azzopardi, Miriam, Johannesen, Tom Børge, Didkowska, Joanna, Wojciechowska, Urszula, Bielska-Lasota, Magdalena, Pais, Ana, Bento, Maria José, Calisto, Rita, Lourenço, António, Safaei Diba, Chakameh, Zadnik, Vesna, Zagar, Tina, Sánchez-Contador Escudero, Carmen, Franch Sureda, Paula, Lopez de Munain, Arantza, De-La-Cruz, Marta, Rojas, Marìa Dolores, Aleman, Araceli, Vizcaino, Ana, Marcos-Gragera, Rafael, Sanvisens, Arantza, Sanchez, Maria Josè, Chirlaque Lopez, Maria Dolores, Sanchez-Gil, Antonia, Guevara, Marcela, Ardanaz, Eva, Galceran, Jaume, Carulla, Maria, Bergeron, Yvan, Bouchardy, Christine, Mohsen Mousavi, Seyed, Went, Philip, Blum, Marcel, Bordoni, Andrea, Visser, Otto, Stevens, Sarah, Broggio, John, Bennett, Damien, Gavin, Anna, Morrison, David, Huws, Dyfed Wyn, Ventura, Leonardo, Paapsi, Keiu, Randi, Giorgia, Bettio, Manola, and Guzzinati, Stefano
Published: 2024
Full Text: View/download PDF

37. Locus-level L1 DNA methylation profiling reveals the epigenetic and transcriptional interplay between L1s and their integration sites

Author: Lanciano, Sophie, Philippe, Claude, Sarkar, Arpita, Pratella, David, Domrane, Cécilia, Doucet, Aurélien J., van Essen, Dominic, Saccani, Simona, Ferry, Laure, Defossez, Pierre-Antoine, and Cristofari, Gael
Published: 2024
Full Text: View/download PDF

38. Mechanical vulnerability of beech (Fagus sylvatica L.) poles after thinning: Securing stem or roots is risk dependent

Author: Dlouhá, Jana, Défossez, Pauline, Dongmo Keumo Jiazet, Joel Hans, Ningre, François, Fournier, Meriem, and Constant, Thiéry
Published: 2024
Full Text: View/download PDF

39. Surgical patterns of care of pancreatic cancer. A French population-based study

Author: Bara, S., Bouvier, A.M., Jooste, V., Alves, A., Bouvier, V., Seigneurin, A., Coureau, G., Molinié, F., Dalmeida, T., Grosclaude, P., Daubisse-Marliac, L., Defossez, G., Guizard, A.V., Lapôtre-Ledoux, B., Hammas, K., Nousbaum, J.B., Plouvier, S., Trétarre, B., Velten, M., Woronoff, A.S., Goebel, Guillaume, Jooste, Valérie, Molinie, Florence, Grosclaude, Pascale, Woronoff, Anne-Sophie, Alves, Arnaud, Bouvier, Véronique, Nousbaum, Jean-Baptiste, Plouvier, Sandrine, Bengrine-Lefevre, Leila, Rabel, Thomas, and Bouvier, Anne-Marie
Published: 2024
Full Text: View/download PDF

40. The structure of behavioral data

Author: Defossez, Aurélien, Ansarinia, Morteza, Clocher, Brice, Schmück, Emmanuel, Schrater, Paul, and Cardoso-Leite, Pedro
Subjects: Quantitative Biology - Neurons and Cognition, Statistics - Methodology
Abstract: For more than a century, scientists have been collecting behavioral data--an increasing fraction of which is now being publicly shared so other researchers can reuse them to replicate, integrate or extend past results. Although behavioral data is fundamental to many scientific fields, there is currently no widely adopted standard for formatting, naming, organizing, describing or sharing such data. This lack of standardization is a major bottleneck for scientific progress. Not only does it prevent the effective reuse of data, it also affects how behavioral data in general are processed, as non-standard data calls for custom-made data analysis code and prevents the development of efficient tools. To address this problem, we develop the Behaverse Data Model (BDM), a standard for structuring behavioral data. Here we focus on major concepts in behavioral data, leaving further details and developments to the project's website (https://behaverse.github.io/data-model/)., Comment: 12 pages, 1 table, 2 figures
Published: 2020

41. Real Time Speech Enhancement in the Waveform Domain

Author: Defossez, Alexandre, Synnaeve, Gabriel, and Adi, Yossi
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound, Statistics - Machine Learning
Abstract: We present a causal speech enhancement model working on the raw waveform that runs in real-time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip-connections. It is optimized on both time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise including stationary and non-stationary noises, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly on the raw waveform which further improve model performance and its generalization abilities. We perform evaluations on several standard benchmarks, both using objective metrics and human judgements. The proposed model matches state-of-the-art performance of both causal and non causal methods while working directly on the raw waveform., Comment: Interspeech 2020 Paper
Published: 2020

42. A Simple Convergence Proof of Adam and Adagrad

Author: Défossez, Alexandre, Bottou, Léon, Bach, Francis, and Usunier, Nicolas
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We provide a simple proof of convergence covering both the Adam and Adagrad adaptive optimization algorithms when applied to smooth (possibly non-convex) objective functions with bounded gradients. We show that in expectation, the squared norm of the objective gradient averaged over the trajectory has an upper-bound which is explicit in the constants of the problem, parameters of the optimizer, the dimension $d$, and the total number of iterations $N$. This bound can be made arbitrarily small, and with the right hyper-parameters, Adam can be shown to converge with the same rate of convergence $O(d\ln(N)/\sqrt{N})$. When used with the default parameters, Adam doesn't converge, however, and just like constant step-size SGD, it moves away from the initialization point faster than Adagrad, which might explain its practical success. Finally, we obtain the tightest dependency on the heavy ball momentum decay rate $\beta_1$ among all previous convergence bounds for non-convex Adam and Adagrad, improving from $O((1-\beta_1)^{-3})$ to $O((1-\beta_1)^{-1})$., Comment: final TMLR version
Published: 2020

43. Microfluidic-based production of [68Ga]Ga-FAPI-46 and [68Ga]Ga-DOTA-TOC using the cassette-based iMiDEV™ microfluidic radiosynthesizer

Author: Mallapura, Hemantha, Ovdiichuk, Olga, Jussing, Emma, Thuy, Tran A., Piatkowski, Camille, Tanguy, Laurent, Collet-Defossez, Charlotte, Långström, Bengt, Halldin, Christer, and Nag, Sangram
Published: 2023
Full Text: View/download PDF

44. RSK3 switches cell fate: from stress-induced senescence to malignant progression

Author: Huna, Anda, Flaman, Jean-Michel, Lodillinsky, Catalina, Zhu, Kexin, Makulyte, Gabriela, Pakulska, Victoria, Coute, Yohann, Ruisseaux, Clémence, Saintigny, Pierre, Hernandez-Vargas, Hector, Defossez, Pierre-Antoine, Boissan, Mathieu, Martin, Nadine, and Bernard, David
Published: 2023
Full Text: View/download PDF

45. Unchanged PCNA and DNMT1 dynamics during replication in DNA ligase I-deficient cells but abnormal chromatin levels of non-replicative histone H1

Author: Bhandari, Seema Khattri, Wiest, Nathaniel, Sallmyr, Annahita, Du, Ruofei, Ferry, Laure, Defossez, Pierre-Antoine, and Tomkinson, Alan E.
Published: 2023
Full Text: View/download PDF

46. Music Source Separation in the Waveform Domain

Author: Défossez, Alexandre, Usunier, Nicolas, Bottou, Léon, and Bach, Francis
Subjects: Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing, Statistics - Machine Learning
Abstract: Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song. Such components include voice, bass, drums and any other accompaniments. Contrary to many audio synthesis tasks where the best performances are achieved by models that directly generate the waveform, the state-of-the-art in source separation for music is to compute masks on the magnitude spectrum. In this paper, we compare two waveform domain architectures. We first adapt Conv-Tasnet, initially developed for speech source separation, to the task of music source separation. While Conv-Tasnet beats many existing spectrogram-domain methods, it suffers from significant artifacts, as shown by human evaluations. We propose instead Demucs, a novel waveform-to-waveform model, with a U-Net structure and bidirectional LSTM. Experiments on the MusDB dataset show that, with proper data augmentation, Demucs beats all existing state-of-the-art architectures, including Conv-Tasnet, with 6.3 SDR on average (and up to 6.8 with 150 extra training songs, even surpassing the IRM oracle for the bass source). Using recent developments in model quantization, Demucs can be compressed down to 120 MB without any loss of accuracy. We also provide human evaluations, showing that Demucs benefits from a large advantage in terms of the naturalness of the audio. However, it suffers from some bleeding, especially between the vocals and other source.
Published: 2019

47. Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed

Author: Défossez, Alexandre, Usunier, Nicolas, Bottou, Léon, and Bach, Francis
Subjects: Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing, Statistics - Machine Learning
Abstract: We study the problem of source separation for music using deep learning with four known sources: drums, bass, vocals and other accompaniments. State-of-the-art approaches predict soft masks over mixture spectrograms while methods working on the waveform are lagging behind as measured on the standard MusDB benchmark. Our contribution is twofold. (i) We introduce a simple convolutional and recurrent model that outperforms the state-of-the-art model on waveforms, that is, Wave-U-Net, by 1.6 points of SDR (signal to distortion ratio). (ii) We propose a new scheme to leverage unlabeled music. We train a first model to extract parts with at least one source silent in unlabeled tracks, for instance without bass. We remix this extract with a bass line taken from the supervised dataset to form a new weakly supervised training example. Combining our architecture and scheme, we show that waveform methods can play in the same ballpark as spectrogram ones.
Published: 2019

48. Unchanged PCNA and DNMT1 dynamics during replication in DNA ligase I-deficient cells but abnormal chromatin levels of non-replicative histone H1

Author: Seema Khattri Bhandari, Nathaniel Wiest, Annahita Sallmyr, Ruofei Du, Laure Ferry, Pierre-Antoine Defossez, and Alan E. Tomkinson
Subjects: Medicine, Science
Abstract: DNA ligase I (LigI), the predominant enzyme that joins Okazaki fragments, interacts with PCNA and Pol δ. LigI also interacts with UHRF1, linking Okazaki fragment joining with DNA maintenance methylation. Okazaki fragments can also be joined by a relatively poorly characterized DNA ligase IIIα (LigIIIα)-dependent backup pathway. Here we examined the effect of LigI-deficiency on proteins at the replication fork. Notably, LigI-deficiency did not alter the kinetics of association of the PCNA clamp, the leading strand polymerase Pol ε, DNA maintenance methylation proteins and core histones with newly synthesized DNA. While the absence of major changes in replication and methylation proteins is consistent with the similar proliferation rate and DNA methylation levels of the LIG1 null cells compared with the parental cells, the increased levels of LigIIIα/XRCC1 and Pol δ at the replication fork and in bulk chromatin indicate that there are subtle replication defects in the absence of LigI. Interestingly, the non-replicative histone H1 variant, H1.0, is enriched in the chromatin of LigI-deficient mouse CH12F3 and human 46BR.1G1 cells. This alteration was not corrected by expression of wild type LigI, suggesting that it is a relatively stable epigenetic change that may contribute to the immunodeficiencies linked with inherited LigI-deficiency syndrome.
Published: 2023
Full Text: View/download PDF

49. Trends in incidence of invasive vaginal cancer in France from 1990 to 2018 and survival of recently diagnosed women – A population-based study

Author: Trétarre, Brigitte, Dantony, Emmanuelle, Coureau, Gaëlle, Defossez, Gautier, Guizard, Anne-Valérie, Delafosse, Patricia, Daubisse, Laetitia, Velten, Michel, Hammas, Karima, Barra, Simona, Lapotre, Bénédicte, Plouvier, Sandrine, d'Almeida, Tania, Molinié, Florence, and Woronoff, Anne-Sophie
Published: 2023
Full Text: View/download PDF

50. Context-dependent CpG methylation directs cell-specific binding of transcription factor ZBTB38

Author: Claire Marchal, Pierre-Antoine Defossez, and Benoit Miotto
Subjects: dna methylation, zinc finger protein zbtb38, e2f factor, alu repeats, cell cycle, Genetics, QH426-470
Abstract: DNA methylation on CpGs regulates transcription in mammals, both by decreasing the binding of methylation-repelled factors and by increasing the binding of methylation-attracted factors. Among the latter, zinc finger proteins have the potential to bind methylated CpGs in a sequence-specific context. The protein ZBTB38 is unique in that it has two independent sets of zinc fingers, which recognize two different methylated consensus sequences in vitro. Here, we identify the binding sites of ZBTB38 in a human cell line, and show that they contain the two methylated consensus sequences identified in vitro. In addition, we show that the distribution of ZBTB38 sites is highly unusual: while 10% of the ZBTB38 sites are also bound by CTCF, the other 90% of sites reside in closed chromatin and are not bound by any of the other factors mapped in our model cell line. Finally, a third of ZBTB38 sites are found upstream of long and active CpG islands. Our work therefore validates ZBTB38 as a methyl-DNA binder in vivo and identifies its unique distribution in the genome.
Published: 2022
Full Text: View/download PDF
