762 results for "Rätsch, Gunnar"
Search Results
2. Preference Elicitation for Offline Reinforcement Learning
- Authors
- Pace, Alizée, Schölkopf, Bernhard, Rätsch, Gunnar, and Ramponi, Giorgia
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Applying reinforcement learning (RL) to real-world problems is often made challenging by the inability to interact with the environment and the difficulty of designing reward functions. Offline RL addresses the first challenge by considering access to an offline dataset of environment interactions labeled by the reward function. In contrast, Preference-based RL does not assume access to the reward function and learns it from preferences, but typically requires an online interaction with the environment. We bridge the gap between these frameworks by exploring efficient methods for acquiring preference feedback in a fully offline setup. We propose Sim-OPRL, an offline preference-based reinforcement learning algorithm, which leverages a learned environment model to elicit preference feedback on simulated rollouts. Drawing on insights from both the offline RL and the preference-based RL literature, our algorithm employs a pessimistic approach for out-of-distribution data, and an optimistic approach for acquiring informative preferences about the optimal policy. We provide theoretical guarantees regarding the sample complexity of our approach, dependent on how well the offline data covers the optimal policy. Finally, we demonstrate the empirical performance of Sim-OPRL in different environments.
- Published
- 2024
3. Multi-Modal Contrastive Learning for Online Clinical Time-Series Applications
- Authors
- Baldenweg, Fabian, Burger, Manuel, Rätsch, Gunnar, and Kuznetsova, Rita
- Subjects
- Computer Science - Machine Learning
- Abstract
Electronic Health Record (EHR) datasets from Intensive Care Units (ICU) contain a diverse set of data modalities. While prior works have successfully leveraged multiple modalities in supervised settings, we apply advanced self-supervised multi-modal contrastive learning techniques to ICU data, specifically focusing on clinical notes and time-series for clinically relevant online prediction tasks. We introduce the Multi-Modal Neighborhood Contrastive Loss (MM-NCL), built on a soft neighborhood function, and showcase the excellent linear probe and zero-shot performance of our approach.
- Comment
- Accepted as a Workshop Paper at TS4H@ICLR2024
- Published
- 2024
4. Dynamic Survival Analysis for Early Event Prediction
- Authors
- Yèche, Hugo, Burger, Manuel, Veshchezerova, Dinara, and Rätsch, Gunnar
- Subjects
- Computer Science - Machine Learning
- Abstract
This study advances Early Event Prediction (EEP) in healthcare through Dynamic Survival Analysis (DSA), offering a novel approach by integrating risk localization into alarm policies to enhance clinical event metrics. By adapting and evaluating DSA models against traditional EEP benchmarks, our research demonstrates their ability to match EEP models on a time-step level and significantly improve event-level metrics through a new alarm prioritization scheme (up to 11% AuPRC difference). This approach represents a significant step forward in predictive healthcare, providing a more nuanced and actionable framework for early event prediction and management.
- Published
- 2024
5. Learning Genomic Sequence Representations using Graph Neural Networks over De Bruijn Graphs
- Authors
- Kapuśniak, Kacper, Burger, Manuel, Rätsch, Gunnar, and Joudaki, Amir
- Subjects
- Computer Science - Machine Learning; Quantitative Biology - Genomics
- Abstract
The rapid expansion of genomic sequence data calls for new methods to achieve robust sequence representations. Existing techniques often neglect intricate structural details, emphasizing mainly contextual information. To address this, we developed k-mer embeddings that merge contextual and structural string information by enhancing De Bruijn graphs with structural similarity connections. Subsequently, we crafted a self-supervised method based on Contrastive Learning that employs a heterogeneous Graph Convolutional Network encoder and constructs positive pairs based on node similarities. Our embeddings consistently outperform prior techniques for Edit Distance Approximation and Closest String Retrieval tasks.
- Comment
- Poster at "NeurIPS 2023 New Frontiers in Graph Learning Workshop (NeurIPS GLFrontiers 2023)"
- Published
- 2023
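The base structure this paper builds on is the standard De Bruijn graph: nodes are (k-1)-mers and each k-mer contributes one edge from its prefix to its suffix. A minimal sketch of that base construction follows; the paper's structural-similarity edges and GNN encoder are not reproduced here, and the dictionary-of-edge-counts representation is my own choice.

```python
from collections import defaultdict

def de_bruijn_edges(sequence: str, k: int):
    """Build De Bruijn edge multiset for one sequence.

    Each k-mer yields an edge from its (k-1)-mer prefix to its
    (k-1)-mer suffix; repeated k-mers increment the edge count.
    """
    edges = defaultdict(int)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)
```

For example, `de_bruijn_edges("ACGTACG", 3)` sees the k-mer `ACG` twice, so the edge `("AC", "CG")` gets count 2 while the other three edges get count 1.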
6. On the Importance of Step-wise Embeddings for Heterogeneous Clinical Time-Series
- Authors
- Kuznetsova, Rita, Pace, Alizée, Burger, Manuel, Yèche, Hugo, and Rätsch, Gunnar
- Subjects
- Computer Science - Machine Learning
- Abstract
Recent advances in deep learning architectures for sequence modeling have not fully transferred to tasks handling time-series from electronic health records. In particular, in problems related to the Intensive Care Unit (ICU), the state of the art is still to tackle sequence classification in a tabular manner with tree-based methods. Recent findings in deep learning for tabular data are now surpassing these classical methods by better handling the severe heterogeneity of data input features. Given the similar level of feature heterogeneity exhibited by ICU time-series and motivated by these findings, we explore these novel methods' impact on clinical sequence modeling tasks. By jointly using such advances in deep learning for tabular data, our primary objective is to underscore the importance of step-wise embeddings in time-series modeling, which remain unexplored in machine learning methods for clinical data. On a variety of clinically relevant tasks from two large-scale ICU datasets, MIMIC-III and HiRID, our work provides an exhaustive analysis of state-of-the-art methods for tabular time-series as time-step embedding models, showing overall performance improvement. In particular, we evidence the importance of feature grouping in clinical time-series, with significant performance gains when considering features within predefined semantic groups in the step-wise embedding module.
- Comment
- Machine Learning for Health (ML4H) 2023 in Proceedings of Machine Learning Research 225
- Published
- 2023
7. Knowledge Graph Representations to enhance Intensive Care Time-Series Predictions
- Authors
- Jain, Samyak, Burger, Manuel, Rätsch, Gunnar, and Kuznetsova, Rita
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Intensive Care Units (ICU) require comprehensive patient data integration for enhanced clinical outcome predictions, crucial for assessing patient conditions. Recent deep learning advances have utilized patient time series data, and fusion models have incorporated unstructured clinical reports, improving predictive performance. However, integrating established medical knowledge into these models has not yet been explored. The medical domain's data, rich in structural relationships, can be harnessed through knowledge graphs derived from clinical ontologies like the Unified Medical Language System (UMLS) for better predictions. Our proposed methodology integrates this knowledge with ICU data, improving clinical decision modeling. It combines graph representations with vital signs and clinical reports, enhancing performance, especially when data is missing. Additionally, our model includes an interpretability component to understand how knowledge graph nodes affect predictions.
- Comment
- Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 11 pages
- Published
- 2023
8. Language Model Training Paradigms for Clinical Feature Embeddings
- Authors
- Hu, Yurong, Burger, Manuel, Rätsch, Gunnar, and Kuznetsova, Rita
- Subjects
- Computer Science - Machine Learning; Computer Science - Computation and Language
- Abstract
In research areas with scarce data, representation learning plays a significant role. This work aims to enhance representation learning for clinical time series by deriving universal embeddings for clinical features, such as heart rate and blood pressure. We use self-supervised training paradigms for language models to learn high-quality clinical feature embeddings, achieving a finer granularity than existing time-step and patient-level representation learning. We visualize the learnt embeddings via unsupervised dimension reduction techniques and observe a high degree of consistency with prior clinical knowledge. We also evaluate the model performance on the MIMIC-III benchmark and demonstrate the effectiveness of using clinical feature embeddings. We publish our code online for replication.
- Comment
- Poster at "NeurIPS 2023 Workshop: Self-Supervised Learning - Theory and Practice"
- Published
- 2023
9. Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion
- Authors
- Meterez, Alexandru, Joudaki, Amir, Orabona, Francesco, Immer, Alexander, Rätsch, Gunnar, and Daneshmand, Hadi
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Normalization layers are one of the key building blocks for deep neural networks. Several theoretical studies have shown that batch normalization improves signal propagation by preventing the representations from becoming collinear across the layers. However, results on the mean-field theory of batch normalization also conclude that this benefit comes at the expense of exploding gradients in depth. Motivated by these two aspects of batch normalization, in this study we pose the following question: "Can a batch-normalized network keep the optimal signal propagation properties, but avoid exploding gradients?" We answer this question in the affirmative by giving a particular construction of a Multi-Layer Perceptron (MLP) with linear activations and batch normalization that provably has bounded gradients at any depth. Based on Weingarten calculus, we develop a rigorous and non-asymptotic theory for this constructed MLP that gives a precise characterization of forward signal propagation, while proving that gradients remain bounded for linearly independent input samples, which holds in most practical settings. Inspired by our theory, we also design an activation shaping scheme that empirically achieves the same properties for certain non-linear activations.
- Published
- 2023
10. Multi-modal Graph Learning over UMLS Knowledge Graphs
- Authors
- Burger, Manuel, Rätsch, Gunnar, and Kuznetsova, Rita
- Subjects
- Computer Science - Machine Learning
- Abstract
Clinicians are increasingly looking towards machine learning to gain insights about patient evolutions. We propose a novel approach named Multi-Modal UMLS Graph Learning (MMUGL) for learning meaningful representations of medical concepts using graph neural networks over knowledge graphs based on the unified medical language system. These representations are aggregated to represent entire patient visits and then fed into a sequence model to perform predictions at the granularity of multiple hospital visits of a patient. We improve performance by incorporating prior medical knowledge and considering multiple modalities. We compare our method to existing architectures proposed to learn representations at different granularities on the MIMIC-III dataset and show that our approach outperforms these methods. The results demonstrate the significance of multi-modal medical concept representations based on prior medical knowledge., Comment: Machine Learning for Health (ML4H) 2023 in Proceedings of Machine Learning Research 225
- Published
- 2023
11. Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels
- Authors
- Immer, Alexander, van der Ouderaa, Tycho F. A., van der Wilk, Mark, Rätsch, Gunnar, and Schölkopf, Bernhard
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Selecting hyperparameters in deep learning greatly impacts its effectiveness but requires manual effort and expertise. Recent works show that Bayesian model selection with Laplace approximations can allow one to optimize such hyperparameters just like standard neural network parameters, using gradients and on the training data. However, estimating a single hyperparameter gradient requires a pass through the entire dataset, limiting the scalability of such algorithms. In this work, we overcome this issue by introducing lower bounds to the linearized Laplace approximation of the marginal likelihood. In contrast to previous estimators, these bounds are amenable to stochastic-gradient-based optimization and allow one to trade off estimation accuracy against computational complexity. We derive them using the function-space form of the linearized Laplace, which can be estimated using the neural tangent kernel. Experimentally, we show that the estimators can significantly accelerate gradient-based hyperparameter optimization.
- Comment
- ICML 2023
- Published
- 2023
12. Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding
- Authors
- Pace, Alizée, Yèche, Hugo, Schölkopf, Bernhard, Rätsch, Gunnar, and Tennenholtz, Guy
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the observed outcomes. Hidden confounding can compromise the validity of any causal conclusion drawn from data and presents a major obstacle to effective offline RL. In the present paper, we tackle the problem of hidden confounding in the nonidentifiable setting. We propose a definition of uncertainty due to hidden confounding bias, termed delphic uncertainty, which uses variation over world models compatible with the observations, and differentiate it from the well-known epistemic and aleatoric uncertainties. We derive a practical method for estimating the three types of uncertainties, and construct a pessimistic offline RL algorithm to account for them. Our method does not assume identifiability of the unobserved confounders, and attempts to reduce the amount of confounding bias. We demonstrate through extensive experiments and ablations the efficacy of our approach on a sepsis management benchmark, as well as on electronic health records. Our results suggest that nonidentifiable hidden confounding bias can be mitigated to improve offline RL solutions in practice.
- Published
- 2023
13. Improving Neural Additive Models with Bayesian Principles
- Authors
- Bouchiat, Kouroche, Immer, Alexander, Yèche, Hugo, Rätsch, Gunnar, and Fortuin, Vincent
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks. However, they lack inherent mechanisms that provide calibrated uncertainties and enable selection of relevant features and interactions. Approaching NAMs from a Bayesian perspective, we augment them in three primary ways, namely by a) providing credible intervals for the individual additive sub-networks; b) estimating the marginal likelihood to perform an implicit selection of features via an empirical Bayes procedure; and c) facilitating the ranking of feature pairs as candidates for second-order interaction in fine-tuned models. In particular, we develop Laplace-approximated NAMs (LA-NAMs), which show improved empirical performance on tabular datasets and challenging real-world medical tasks.
- Comment
- 41st International Conference on Machine Learning (ICML 2024)
- Published
- 2023
14. Modeling multiple sclerosis using mobile and wearable sensor data
- Authors
- Gashi, Shkurta, Oldrati, Pietro, Moebus, Max, Hilty, Marc, Barrios, Liliana, Ozdemir, Firat, Kana, Veronika, Lutterotti, Andreas, Rätsch, Gunnar, and Holz, Christian
- Published
- 2024
15. On the Importance of Clinical Notes in Multi-modal Learning for EHR Data
- Authors
- Husmann, Severin, Yèche, Hugo, Rätsch, Gunnar, and Kuznetsova, Rita
- Subjects
- Computer Science - Machine Learning
- Abstract
Understanding deep learning model behavior is critical to accepting machine learning-based decision support systems in the medical community. Previous research has shown that jointly using clinical notes with electronic health record (EHR) data improved predictive performance for patient monitoring in the intensive care unit (ICU). In this work, we explore the underlying reasons for these improvements. While relying on a basic attention-based model to allow for interpretability, we first confirm that performance significantly improves over state-of-the-art EHR data models when combining EHR data and clinical notes. We then provide an analysis showing improvements arise almost exclusively from a subset of notes containing broader context on patient state rather than clinician notes. We believe such findings highlight deep learning models for EHR data to be more limited by partially-descriptive data than by modeling choice, motivating a more data-centric approach in the field.
- Comment
- Workshop on Learning from Time Series for Health, 36th Conference on Neural Information Processing Systems (NeurIPS 2022), 15 pages (including appendices)
- Published
- 2022
16. Temporal Label Smoothing for Early Event Prediction
- Authors
- Yèche, Hugo, Pace, Alizée, Rätsch, Gunnar, and Kuznetsova, Rita
- Subjects
- Computer Science - Machine Learning
- Abstract
Models that can predict the occurrence of events ahead of time with low false-alarm rates are critical to the acceptance of decision support systems in the medical community. This challenging task is typically treated as a simple binary classification, ignoring temporal dependencies between samples, whereas we propose to exploit this structure. We first introduce a common theoretical framework unifying dynamic survival analysis and early event prediction. Following an analysis of objectives from both fields, we propose Temporal Label Smoothing (TLS), a simpler, yet best-performing method that preserves prediction monotonicity over time. By focusing the objective on areas with a stronger predictive signal, TLS improves performance over all baselines on two large-scale benchmark tasks. Gains are particularly notable along clinically relevant measures, such as event recall at low false-alarm rates. TLS reduces the number of missed events by up to a factor of two over previously used approaches in early event prediction.
- Published
- 2022
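The core idea of temporal label smoothing can be sketched as replacing hard 0/1 event labels with soft targets that decay monotonically with the time remaining until the event, so the objective concentrates on steps with stronger predictive signal. The exponential form and the `gamma` parameter below are illustrative assumptions, not the paper's exact parameterization.

```python
import math

def temporal_label_smoothing(time_to_event: float, horizon: float,
                             gamma: float = 0.3) -> float:
    """Soft target for one time step.

    Steps beyond the prediction horizon keep label 0; inside the
    horizon the label rises smoothly toward 1 as the event approaches.
    `gamma` (hypothetical) controls how sharply the label decays.
    """
    if time_to_event > horizon:
        return 0.0
    # exponential decay in time-to-event keeps labels monotone in time
    return math.exp(-gamma * time_to_event)
```

A step at the event itself gets label 1.0, a step nine hours before an eight-hour horizon gets 0.0, and labels in between decrease smoothly with distance to the event.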
17. Learning single-cell perturbation responses using neural optimal transport
- Authors
- Bunne, Charlotte, Stark, Stefan G., Gut, Gabriele, del Castillo, Jacobo Sarabia, Levesque, Mitch, Lehmann, Kjong-Van, Pelkmans, Lucas, Krause, Andreas, and Rätsch, Gunnar
- Published
- 2023
18. Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization
- Authors
- Dresdner, Gideon, Vladarean, Maria-Luiza, Rätsch, Gunnar, Locatello, Francesco, Cevher, Volkan, and Yurtsever, Alp
- Subjects
- Computer Science - Machine Learning; Mathematics - Optimization and Control
- Abstract
We propose a stochastic conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. Existing CGM variants for this template either suffer from slow convergence rates, or require carefully increasing the batch size over the course of the algorithm's execution, which leads to computing full gradients. In contrast, the proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques. In applications we put special emphasis on problems with a large number of separable constraints. Such problems are prevalent among semidefinite programming (SDP) formulations arising in machine learning and theoretical computer science. We provide numerical experiments on matrix completion, unsupervised clustering, and sparsest-cut SDPs.
- Comment
- Artificial Intelligence and Statistics (AISTATS) 2022
- Published
- 2022
19. Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations
- Authors
- Immer, Alexander, van der Ouderaa, Tycho F. A., Rätsch, Gunnar, Fortuin, Vincent, and van der Wilk, Mark
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Data augmentation is commonly applied to improve performance of deep learning by enforcing the knowledge that certain transformations on the input preserve the output. Currently, the data augmentation parameters are chosen by human effort and costly cross-validation, which makes it cumbersome to apply to new datasets. We develop a convenient gradient-based method for selecting the data augmentation without validation data during training of a deep neural network. Our approach relies on phrasing data augmentation as an invariance in the prior distribution on the functions of a neural network, which allows us to learn it using Bayesian model selection. This has been shown to work in Gaussian processes, but not yet for deep neural networks. We propose a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective, which can be optimised without human supervision or validation data. We show that our method can successfully recover invariances present in the data, and that this improves generalisation and data efficiency on image datasets.
- Comment
- NeurIPS 2022
- Published
- 2022
20. HiRID-ICU-Benchmark -- A Comprehensive Machine Learning Benchmark on High-resolution ICU Data
- Authors
- Yèche, Hugo, Kuznetsova, Rita, Zimmermann, Marc, Hüser, Matthias, Lyu, Xinrui, Faltys, Martin, and Rätsch, Gunnar
- Subjects
- Computer Science - Machine Learning
- Abstract
The recent success of machine learning methods applied to time series collected from Intensive Care Units (ICU) exposes the lack of standardized machine learning benchmarks for developing and comparing such methods. While raw datasets, such as MIMIC-IV or eICU, can be freely accessed on Physionet, tasks and pre-processing are often chosen ad hoc for each publication, limiting comparability across publications. In this work, we aim to improve this situation by providing a benchmark covering a large spectrum of ICU-related tasks. Using the HiRID dataset, we define multiple clinically relevant tasks in collaboration with clinicians. In addition, we provide a reproducible end-to-end pipeline to construct both data and labels. Finally, we provide an in-depth analysis of current state-of-the-art sequence modeling methods, highlighting some limitations of deep learning approaches for this type of data. With this benchmark, we hope to give the research community the possibility of a fair comparison of their work.
- Comment
- NeurIPS 2021 (Datasets and Benchmarks)
- Published
- 2021
21. Neighborhood Contrastive Learning Applied to Online Patient Monitoring
- Authors
- Yèche, Hugo, Dresdner, Gideon, Locatello, Francesco, Hüser, Matthias, and Rätsch, Gunnar
- Subjects
- Computer Science - Machine Learning
- Abstract
Intensive care units (ICU) are increasingly looking towards machine learning for methods to provide online monitoring of critically ill patients. In machine learning, online monitoring is often formulated as a supervised learning problem. Recently, contrastive learning approaches have demonstrated promising improvements over competitive supervised benchmarks. These methods rely on well-understood data augmentation techniques developed for image data which do not apply to online monitoring. In this work, we overcome this limitation by supplementing time-series data augmentation techniques with a novel contrastive learning objective which we call neighborhood contrastive learning (NCL). Our objective explicitly groups together contiguous time segments from each patient while maintaining state-specific information. Our experiments demonstrate a marked improvement over existing work applying contrastive methods to medical time-series.
- Comment
- ICML 2021
- Published
- 2021
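The neighborhood idea above can be sketched as an InfoNCE-style objective in which samples from the same patient that are close in time act as positives for one another. Everything below (cosine similarities, the time window `w`, the temperature) is an illustrative reconstruction under those assumptions, not the paper's exact objective.

```python
import numpy as np

def neighborhood_contrastive_loss(z, patient_ids, times, w=2.0, temp=0.1):
    """InfoNCE-style loss: same-patient samples within `w` time steps
    are positives; all other samples in the batch act as the
    normalization set. `z` is an (n, d) array of embeddings."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temp
    n = len(z)
    # neighborhood mask: same patient AND close in time, excluding self
    neighbors = (patient_ids[:, None] == patient_ids[None, :]) & \
                (np.abs(times[:, None] - times[None, :]) <= w)
    np.fill_diagonal(neighbors, False)
    loss, count = 0.0, 0
    for i in range(n):
        logits = np.delete(sim[i], i)        # similarities to all others
        mask = np.delete(neighbors[i], i)    # which of those are positives
        if not mask.any():
            continue
        log_denom = np.log(np.exp(logits).sum())
        # average negative log-probability over the positive neighbors
        loss += np.mean(log_denom - logits[mask])
        count += 1
    return loss / max(count, 1)
```

As a sanity check, a batch in which each patient's time steps share one embedding should score lower (better) than a batch in which neighbors point in different directions.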
22. Boosting Variational Inference With Locally Adaptive Step-Sizes
- Authors
- Dresdner, Gideon, Shekhar, Saurav, Pedregosa, Fabian, Locatello, Francesco, and Rätsch, Gunnar
- Subjects
- Computer Science - Machine Learning; Statistics - Machine Learning
- Abstract
Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution. Instead, Boosting Variational Inference allows practitioners to obtain increasingly good posterior approximations by spending more compute. The main obstacle to widespread adoption of Boosting Variational Inference is the amount of resources necessary to improve over a strong Variational Inference baseline. In our work, we trace this limitation back to the global curvature of the KL-divergence. We characterize how the global curvature impacts time and memory consumption, address the problem with the notion of local curvature, and provide a novel approximate backtracking algorithm for estimating local curvature. We give new theoretical convergence rates for our algorithms and provide experimental validation on synthetic and real-world datasets.
- Published
- 2021
23. Early prediction of respiratory failure in the intensive care unit
- Authors
- Hüser, Matthias, Faltys, Martin, Lyu, Xinrui, Barber, Chris, Hyland, Stephanie L., Merz, Tobias M., and Rätsch, Gunnar
- Subjects
- Computer Science - Machine Learning; Statistics - Machine Learning
- Abstract
The development of respiratory failure is common among patients in intensive care units (ICU). Large data quantities from ICU patient monitoring systems make timely and comprehensive analysis by clinicians difficult but are ideal for automatic processing by machine learning algorithms. Early prediction of respiratory system failure could alert clinicians to patients at risk of respiratory failure and allow for early patient reassessment and treatment adjustment. We propose an early warning system that predicts moderate/severe respiratory failure up to 8 hours in advance. Our system was trained on HiRID-II, a dataset containing more than 60,000 admissions to a tertiary care ICU. An alarm is typically triggered several hours before the beginning of respiratory failure. Our system outperforms a clinical baseline mimicking traditional clinical decision-making based on pulse-oximetric oxygen saturation and the fraction of inspired oxygen. To provide model introspection and diagnostics, we developed an easy-to-use web browser-based system to explore model input data and predictions visually.
- Comment
- 14 pages, 5 figures
- Published
- 2021
24. Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning
- Authors
- Immer, Alexander, Bauer, Matthias, Fortuin, Vincent, Rätsch, Gunnar, and Khan, Mohammad Emtiyaz
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Marginal-likelihood based model-selection, even though promising, is rarely used in deep learning due to estimation difficulties. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures, based on the training data alone. Some hyperparameters can be estimated online during training, simplifying the procedure. Our marginal-likelihood estimate is based on Laplace's method and Gauss-Newton approximations to the Hessian, and it outperforms cross-validation and manual-tuning on standard regression and image classification datasets, especially in terms of calibration and out-of-distribution detection. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable (e.g., in nonstationary settings).
- Comment
- ICML 2021
- Published
- 2021
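For readers unfamiliar with Laplace's method mentioned in the abstract, the standard form of the estimate is a quadratic expansion of the log joint around a MAP estimate $\theta_*$ (the paper then replaces the exact Hessian with Gauss-Newton approximations for scalability):

```latex
\log p(\mathcal{D} \mid \mathcal{M})
  \approx \log p(\mathcal{D} \mid \theta_*, \mathcal{M})
        + \log p(\theta_* \mid \mathcal{M})
        + \frac{d}{2}\log 2\pi
        - \frac{1}{2}\log\det H_{\theta_*},
\qquad
H_{\theta_*} = -\nabla^2_\theta \log p(\mathcal{D}, \theta \mid \mathcal{M})\big|_{\theta_*},
```

where $d$ is the number of parameters; the determinant term penalizes sharply curved (overconfident) fits, which is what makes the estimate usable for model selection without validation data.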
25. Bayesian Neural Network Priors Revisited
- Authors
- Fortuin, Vincent, Garriga-Alonso, Adrià, Ober, Sebastian W., Wenzel, Florian, Rätsch, Gunnar, Turner, Richard E., van der Wilk, Mark, and Aitchison, Laurence
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, it is unclear whether these priors accurately reflect our true beliefs about the weight distributions or give optimal performance. To find better priors, we study summary statistics of neural network weights in networks trained using stochastic gradient descent (SGD). We find that convolutional neural network (CNN) and ResNet weights display strong spatial correlations, while fully connected networks (FCNNs) display heavy-tailed weight distributions. We show that building these observations into priors can lead to improved performance on a variety of image classification datasets. Surprisingly, these priors mitigate the cold posterior effect in FCNNs, but slightly increase the cold posterior effect in ResNets.
- Comment
- Accepted at ICLR 2022
- Published
- 2021
26. On Disentanglement in Gaussian Process Variational Autoencoders
- Authors
- Bing, Simon, Fortuin, Vincent, and Rätsch, Gunnar
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Complex multivariate time series arise in many fields, ranging from computer vision to robotics or medicine. Often we are interested in the independent underlying factors that give rise to the high-dimensional data we are observing. While many models have been introduced to learn such disentangled representations, only few attempt to explicitly exploit the structure of sequential data. We investigate the disentanglement properties of Gaussian process variational autoencoders, a class of models recently introduced that have been successful in different tasks on time series data. Our model exploits the temporal structure of the data by modeling each latent channel with a GP prior and employing a structured variational distribution that can capture dependencies in time. We demonstrate the competitiveness of our approach against state-of-the-art unsupervised and weakly-supervised disentanglement methods on a benchmark task. Moreover, we provide evidence that we can learn meaningful disentangled representations on real-world medical time series data.
- Published
- 2021
27. WRSE -- a non-parametric weighted-resolution ensemble for predicting individual survival distributions in the ICU
- Authors
- Heitz, Jonathan, Ficek, Joanna, Faltys, Martin, Merz, Tobias M., Rätsch, Gunnar, and Hüser, Matthias
- Subjects
- Statistics - Machine Learning; Computer Science - Machine Learning
- Abstract
Dynamic assessment of mortality risk in the intensive care unit (ICU) can be used to stratify patients, inform about treatment effectiveness or serve as part of an early-warning system. Static risk scoring systems, such as APACHE or SAPS, have recently been supplemented with data-driven approaches that track the dynamic mortality risk over time. Recent works have focused on enhancing the information delivered to clinicians even further by producing full survival distributions instead of point predictions or fixed horizon risks. In this work, we propose a non-parametric ensemble model, Weighted Resolution Survival Ensemble (WRSE), tailored to estimate such dynamic individual survival distributions. Inspired by the simplicity and robustness of ensemble methods, the proposed approach combines a set of binary classifiers spaced according to a decay function reflecting the relevance of short-term mortality predictions. Models and baselines are evaluated under weighted calibration and discrimination metrics for individual survival distributions which closely reflect the utility of a model in ICU practice. We show competitive results with state-of-the-art probabilistic models, while greatly reducing training time by factors of 2-9x.
- Comment
- 9 pages, 6 figures
- Published
- 2020
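The decay-based classifier spacing and the assembly of per-horizon risks into an individual survival curve, as described in the WRSE abstract above, can be sketched as follows. This is a minimal illustration, not the paper's exact construction: the exponential relevance rate and the running-maximum monotonicity fix are hypothetical choices.

```python
import math

def decay_spaced_horizons(n_models, t_max, rate=0.03):
    """Place prediction horizons densely at short times and sparsely at long
    times, via quantiles of an exponential decay truncated at t_max
    (the rate is a hypothetical choice)."""
    qs = [(i + 1) / (n_models + 1) for i in range(n_models)]
    z = 1.0 - math.exp(-rate * t_max)
    return [-math.log(1.0 - q * z) / rate for q in qs]

def survival_curve(per_horizon_risks):
    """Combine per-horizon binary risks P(event <= t_k) into a survival curve
    S(t_k) = 1 - P(event <= t_k), enforced monotone via a running maximum."""
    curve, worst = [], 0.0
    for r in per_horizon_risks:
        worst = max(worst, r)
        curve.append(1.0 - worst)
    return curve

horizons = decay_spaced_horizons(5, t_max=168)          # five horizons within one week (hours)
surv = survival_curve([0.05, 0.04, 0.11, 0.20, 0.35])   # one classifier output per horizon
```

Note how the spacing places more horizons at short times, reflecting the higher clinical relevance of short-term mortality predictions.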
28. A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation
- Author
-
Locatello, Francesco, Bauer, Stefan, Lucic, Mario, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard, and Bachem, Olivier
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
The idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train over 14,000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on eight data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, different evaluation metrics do not always agree on what should be considered "disentangled" and exhibit systematic differences in the estimation. Finally, increased disentanglement does not seem to necessarily lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets., Comment: arXiv admin note: substantial text overlap with arXiv:1811.12359
- Published
- 2020
29. Scalable Gaussian Process Variational Autoencoders
- Author
-
Jazbec, Metod, Ashman, Matthew, Fortuin, Vincent, Pearce, Michael, Mandt, Stephan, and Rätsch, Gunnar
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Conventional variational autoencoders fail in modeling correlations between data points due to their use of factorized priors. Amortized Gaussian process inference through GP-VAEs has led to significant improvements in this regard, but is still inhibited by the intrinsic complexity of exact GP inference. We improve the scalability of these methods through principled sparse inference approaches. We propose a new scalable GP-VAE model that outperforms existing approaches in terms of runtime and memory footprint, is easy to implement, and allows for joint end-to-end optimization of all components., Comment: Published at AISTATS 2021
- Published
- 2020
30. Integrated multi-omics reveals anaplerotic rewiring in methylmalonyl-CoA mutase deficiency
- Author
-
Forny, Patrick, Bonilla, Ximena, Lamparter, David, Shao, Wenguang, Plessl, Tanja, Frei, Caroline, Bingisser, Anna, Goetze, Sandra, van Drogen, Audrey, Harshman, Keith, Pedrioli, Patrick G. A., Howald, Cedric, Poms, Martin, Traversi, Florian, Bürer, Céline, Cherkaoui, Sarah, Morscher, Raphael J., Simmons, Luke, Forny, Merima, Xenarios, Ioannis, Aebersold, Ruedi, Zamboni, Nicola, Rätsch, Gunnar, Dermitzakis, Emmanouil T., Wollscheid, Bernd, Baumgartner, Matthias R., and Froese, D. Sean
- Published
- 2023
- Full Text
- View/download PDF
31. A Commentary on the Unsupervised Learning of Disentangled Representations
- Author
-
Locatello, Francesco, Bauer, Stefan, Lucic, Mario, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard, and Bachem, Olivier
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision. In this paper, we summarize the results of Locatello et al., 2019, and focus on their implications for practitioners. We discuss the theoretical result showing that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases and the practical challenges it entails. Finally, we comment on our experimental findings, highlighting the limitations of state-of-the-art approaches and directions for future research.
- Published
- 2020
32. Weakly-Supervised Disentanglement Without Compromises
- Author
-
Locatello, Francesco, Poole, Ben, Rätsch, Gunnar, Schölkopf, Bernhard, Bachem, Olivier, and Tschannen, Michael
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Intelligent agents should be able to learn useful representations by observing changes in their environment. We model such observations as pairs of non-i.i.d. images sharing at least one of the underlying factors of variation. First, we theoretically show that only knowing how many factors have changed, but not which ones, is sufficient to learn disentangled representations. Second, we provide practical algorithms that learn disentangled representations from pairs of images without requiring annotation of groups, individual factors, or the number of factors that have changed. Third, we perform a large-scale empirical study and show that such pairs of observations are sufficient to reliably learn disentangled representations on several benchmark data sets. Finally, we evaluate our learned representations and find that they are simultaneously useful on a diverse suite of tasks, including generalization under covariate shifts, fairness, and abstract reasoning. Overall, our results demonstrate that weak supervision enables learning of useful disentangled representations in realistic scenarios., Comment: We updated the description of the generation of the dataset compared to the ICML version
- Published
- 2020
33. A global metagenomic map of urban microbiomes and antimicrobial resistance.
- Author
-
Danko, David, Bezdan, Daniela, Afshin, Evan E, Ahsanuddin, Sofia, Bhattacharya, Chandrima, Butler, Daniel J, Chng, Kern Rei, Donnellan, Daisy, Hecht, Jochen, Jackson, Katelyn, Kuchin, Katerina, Karasikov, Mikhail, Lyons, Abigail, Mak, Lauren, Meleshko, Dmitry, Mustafa, Harun, Mutai, Beth, Neches, Russell Y, Ng, Amanda, Nikolayeva, Olga, Nikolayeva, Tatyana, Png, Eileen, Ryon, Krista A, Sanchez, Jorge L, Shaaban, Heba, Sierra, Maria A, Thomas, Dominique, Young, Ben, Abudayyeh, Omar O, Alicea, Josue, Bhattacharyya, Malay, Blekhman, Ran, Castro-Nallar, Eduardo, Cañas, Ana M, Chatziefthimiou, Aspassia D, Crawford, Robert W, De Filippis, Francesca, Deng, Youping, Desnues, Christelle, Dias-Neto, Emmanuel, Dybwad, Marius, Elhaik, Eran, Ercolini, Danilo, Frolova, Alina, Gankin, Dennis, Gootenberg, Jonathan S, Graf, Alexandra B, Green, David C, Hajirasouliha, Iman, Hastings, Jaden JA, Hernandez, Mark, Iraola, Gregorio, Jang, Soojin, Kahles, Andre, Kelly, Frank J, Knights, Kaymisha, Kyrpides, Nikos C, Łabaj, Paweł P, Lee, Patrick KH, Leung, Marcus HY, Ljungdahl, Per O, Mason-Buck, Gabriella, McGrath, Ken, Meydan, Cem, Mongodin, Emmanuel F, Moraes, Milton Ozorio, Nagarajan, Niranjan, Nieto-Caballero, Marina, Noushmehr, Houtan, Oliveira, Manuela, Ossowski, Stephan, Osuolale, Olayinka O, Özcan, Orhan, Paez-Espino, David, Rascovan, Nicolás, Richard, Hugues, Rätsch, Gunnar, Schriml, Lynn M, Semmler, Torsten, Sezerman, Osman U, Shi, Leming, Shi, Tieliu, Siam, Rania, Song, Le Huu, Suzuki, Haruo, Court, Denise Syndercombe, Tighe, Scott W, Tong, Xinzhao, Udekwu, Klas I, Ugalde, Juan A, Valentine, Brandon, Vassilev, Dimitar I, Vayndorf, Elena M, Velavan, Thirumalaisamy P, Wu, Jun, Zambrano, María M, Zhu, Jifeng, Zhu, Sibo, Mason, Christopher E, and International MetaSUB Consortium
- Subjects
International MetaSUB Consortium ,AMR ,BGC ,NGS ,antimicrobial resistance ,built Environment ,de novo assembly ,global health ,metagenome ,microbiome ,shotgun sequencing ,Antimicrobial Resistance ,Biotechnology ,Human Genome ,Genetics ,Infection ,Developmental Biology ,Biological Sciences ,Medical and Health Sciences - Abstract
We present a global atlas of 4,728 metagenomic samples from mass-transit systems in 60 cities over 3 years, representing the first systematic, worldwide catalog of the urban microbial ecosystem. This atlas provides an annotated, geospatial profile of microbial strains, functional characteristics, antimicrobial resistance (AMR) markers, and genetic elements, including 10,928 viruses, 1,302 bacteria, 2 archaea, and 838,532 CRISPR arrays not found in reference databases. We identified 4,246 known species of urban microorganisms and a consistent set of 31 species found in 97% of samples that were distinct from human commensal organisms. Profiles of AMR genes varied widely in type and density across cities. Cities showed distinct microbial taxonomic signatures that were driven by climate and geographic differences. These results constitute a high-resolution global metagenomic atlas that enables discovery of organisms and genes, highlights potential public health and forensic applications, and provides a culture-independent view of AMR burden in cities.
- Published
- 2021
34. Communication-Efficient Jaccard Similarity for High-Performance Distributed Genome Comparisons
- Author
-
Besta, Maciej, Kanakagiri, Raghavendra, Mustafa, Harun, Karasikov, Mikhail, Rätsch, Gunnar, Hoefler, Torsten, and Solomonik, Edgar
- Subjects
Computer Science - Computational Engineering, Finance, and Science ,Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Performance ,Quantitative Biology - Genomics - Abstract
The Jaccard similarity index is an important measure of the overlap of two sets, widely used in machine learning, computational genomics, information retrieval, and many other areas. We design and implement SimilarityAtScale, the first communication-efficient distributed algorithm for computing the Jaccard similarity among pairs of large datasets. Our algorithm provides an efficient encoding of this problem into a multiplication of sparse matrices. Both the encoding and sparse matrix product are performed in a way that minimizes data movement in terms of communication and synchronization costs. We apply our algorithm to obtain similarity among all pairs of a set of large samples of genomes. This task is a key part of modern metagenomics analysis and an ever-growing need due to the increasing availability of high-throughput DNA sequencing data. The resulting scheme is the first to enable accurate Jaccard distance derivations for massive datasets, using large-scale distributed-memory systems. We package our routines in a tool, called GenomeAtScale, that combines the proposed algorithm with tools for processing input sequences. Our evaluation on real data illustrates that one can use GenomeAtScale to effectively employ tens of thousands of processors to reach new frontiers in large-scale genomic and metagenomic analysis. While GenomeAtScale can be used to foster DNA research, the more general underlying SimilarityAtScale algorithm may be used for high-performance distributed similarity computations in other data analytics application domains.
- Published
- 2019
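The sparse-matrix encoding described in the SimilarityAtScale abstract above can be illustrated in miniature: with X the 0/1 sample-by-k-mer matrix, intersection sizes are the entries of X·Xᵀ, and Jaccard similarity follows by inclusion-exclusion. A toy single-machine sketch (the communication-avoiding distributed part is omitted):

```python
def jaccard_matrix(samples):
    """All-pairs Jaccard similarity via the sparse-matrix encoding:
    each sample is a sparse 0/1 row; (X X^T)_{ij} = |A_i ∩ A_j| and
    J_ij = inter / (|A_i| + |A_j| - inter)."""
    rows = [dict.fromkeys(s, 1) for s in samples]   # sparse 0/1 rows

    def dot(a, b):                                  # sparse inner product
        if len(a) > len(b):
            a, b = b, a
        return sum(v * b.get(k, 0) for k, v in a.items())

    n = len(rows)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            inter = dot(rows[i], rows[j])
            union = len(rows[i]) + len(rows[j]) - inter
            J[i][j] = J[j][i] = inter / union if union else 1.0
    return J
```

In the actual distributed setting, each process would hold a block of rows and the sparse products would be scheduled to minimize communication, which is the paper's contribution.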
35. DPSOM: Deep Probabilistic Clustering with Self-Organizing Maps
- Author
-
Manduchi, Laura, Hüser, Matthias, Vogt, Julia, Rätsch, Gunnar, and Fortuin, Vincent
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Generating interpretable visualizations from complex data is a common problem in many applications. Two key ingredients for tackling this issue are clustering and representation learning. However, current methods do not yet successfully combine the strengths of these two approaches. Existing representation learning models that rely on latent topological structure, such as self-organizing maps, exhibit markedly lower clustering performance compared to recent deep clustering methods. To close this performance gap, we (a) present a novel way to fit self-organizing maps with probabilistic cluster assignments (PSOM), (b) propose a new deep architecture for probabilistic clustering (DPSOM) using a VAE, and (c) extend our architecture for time-series clustering (T-DPSOM), which also allows forecasting in the latent space using LSTMs. We show that DPSOM achieves superior clustering performance compared to current deep clustering methods on MNIST/Fashion-MNIST, while maintaining the favourable visualization properties of SOMs. On medical time series, we show that T-DPSOM outperforms baseline methods in time series clustering and time series forecasting, while providing interpretable visualizations of patient state trajectories and uncertainty estimation.
- Published
- 2019
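One common way to realize the probabilistic cluster assignments mentioned in the DPSOM abstract above is a Student-t similarity kernel, as popularized by deep embedded clustering. This is a stand-in illustration, not necessarily the paper's exact assignment rule:

```python
def soft_assignments(z, centroids, alpha=1.0):
    """Probabilistic assignment of a latent point z to SOM centroids using a
    Student-t kernel (a stand-in choice borrowed from deep embedded
    clustering): q_k ∝ (1 + ||z - c_k||^2 / alpha)^(-(alpha + 1) / 2)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    w = [(1.0 + sqdist(z, c) / alpha) ** (-(alpha + 1.0) / 2.0)
         for c in centroids]
    s = sum(w)
    return [x / s for x in w]
```

Soft assignments like these keep the clustering step differentiable, which is what allows joint training with the VAE encoder.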
36. Author Correction: Learning single-cell perturbation responses using neural optimal transport
- Author
-
Bunne, Charlotte, Stark, Stefan G., Gut, Gabriele, del Castillo, Jacobo Sarabia, Levesque, Mitch, Lehmann, Kjong-Van, Pelkmans, Lucas, Krause, Andreas, and Rätsch, Gunnar
- Published
- 2023
- Full Text
- View/download PDF
37. META²: Memory-efficient taxonomic classification and abundance estimation for metagenomics with deep learning
- Author
-
Georgiou, Andreas, Fortuin, Vincent, Mustafa, Harun, and Rätsch, Gunnar
- Subjects
Quantitative Biology - Genomics ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Metagenomic studies have increasingly utilized sequencing technologies in order to analyze DNA fragments found in environmental samples. One important step in this analysis is the taxonomic classification of the DNA fragments. Conventional read classification methods require large databases and vast amounts of memory to run, with recent deep learning methods suffering from very large model sizes. We therefore aim to develop a more memory-efficient technique for taxonomic classification. A task of particular interest is abundance estimation in metagenomic samples. Current attempts rely on classifying single DNA reads independently from each other and are therefore agnostic to co-occurrence patterns between taxa. In this work, we also attempt to take these patterns into account. We develop a novel memory-efficient read classification technique, combining deep learning and locality-sensitive hashing. We show that this approach outperforms conventional mapping-based and other deep learning methods for single-read taxonomic classification when restricting all methods to a fixed memory footprint. Moreover, we formulate the task of abundance estimation as a Multiple Instance Learning (MIL) problem and we extend current deep learning architectures with two different types of permutation-invariant MIL pooling layers: a) deepsets and b) attention-based pooling. We illustrate that our architectures can exploit the co-occurrence of species in metagenomic read sets and outperform the single-read architectures in predicting the distribution over taxa at higher taxonomic ranks.
- Published
- 2019
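The attention-based MIL pooling mentioned in the META² abstract above can be sketched as a permutation-invariant weighted sum over read embeddings. The parameter shapes and values below are purely illustrative; in practice w and V are learned jointly with the read encoder:

```python
import math

def attention_mil_pool(bag, w, V):
    """Permutation-invariant attention pooling over a bag of read embeddings:
    a_i ∝ exp(w · tanh(V h_i)), pooled = Σ_i a_i h_i."""
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, matvec(V, h)))
              for h in bag]
    mx = max(scores)                        # stabilized softmax
    e = [math.exp(s - mx) for s in scores]
    a = [x / sum(e) for x in e]
    dim = len(bag[0])
    return [sum(a[i] * bag[i][d] for i in range(len(bag))) for d in range(dim)]
```

Because the pooled vector ignores read order, the bag-level predictor can exploit co-occurrence of taxa without assuming any ordering of the reads.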
38. GP-VAE: Deep Probabilistic Time Series Imputation
- Author
-
Fortuin, Vincent, Baranchuk, Dmitry, Rätsch, Gunnar, and Mandt, Stephan
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Multivariate time series with missing values are common in areas such as healthcare and finance, and have grown in number and complexity over the years. This raises the question whether deep learning methodologies can outperform classical data imputation methods in this domain. However, naive applications of deep learning fall short in giving reliable confidence estimates and lack interpretability. We propose a new deep sequential latent variable model for dimensionality reduction and data imputation. Our modeling assumption is simple and interpretable: the high dimensional time series has a lower-dimensional representation which evolves smoothly in time according to a Gaussian process. The non-linear dimensionality reduction in the presence of missing data is achieved using a VAE approach with a novel structured variational approximation. We demonstrate that our approach outperforms several classical and deep learning-based data imputation methods on high-dimensional data from the domains of computer vision and healthcare, while additionally improving the smoothness of the imputations and providing interpretable uncertainty estimates., Comment: Accepted for publication at the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
- Published
- 2019
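The core modeling assumption of the GP-VAE abstract above, a representation evolving smoothly in time according to a Gaussian process, can be illustrated with plain GP-posterior-mean imputation in one dimension. This is a toy sketch with an RBF kernel and a naive solver; the paper performs this in a learned latent space with a structured variational approximation:

```python
import math

def rbf(t1, t2, ls=2.0):
    return math.exp(-0.5 * ((t1 - t2) / ls) ** 2)

def solve(A, b):
    # naive Gauss-Jordan elimination with partial pivoting (fine for tiny demos)
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_impute(times_obs, y_obs, times_miss, noise=1e-6):
    """GP posterior mean at missing time points, under the assumption that
    the series evolves smoothly under an RBF kernel."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(times_obs)] for i, a in enumerate(times_obs)]
    alpha = solve(K, y_obs)                    # K^{-1} y
    return [sum(rbf(t, a) * al for a, al in zip(times_obs, alpha))
            for t in times_miss]
```

The GP prior is what gives the imputations their smoothness and calibrated uncertainty; the VAE supplies the non-linear map between observations and the latent series.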
39. Disentangling Factors of Variation Using Few Labels
- Author
-
Locatello, Francesco, Tschannen, Michael, Bauer, Stefan, Rätsch, Gunnar, Schölkopf, Bernhard, and Bachem, Olivier
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
Learning disentangled representations is considered a cornerstone problem in representation learning. Recently, Locatello et al. (2019) demonstrated that unsupervised disentanglement learning without inductive biases is theoretically impossible and that existing inductive biases and unsupervised methods do not allow to consistently learn disentangled representations. However, in many practical settings, one might have access to a limited amount of supervision, for example through manual labeling of (some) factors of variation in a few training examples. In this paper, we investigate the impact of such supervision on state-of-the-art disentanglement methods and perform a large scale study, training over 52,000 models under well-defined and reproducible experimental conditions. We observe that a small number of labeled examples (0.01-0.5% of the data set), with potentially imprecise and incomplete labels, is sufficient to perform model selection on state-of-the-art unsupervised models. Further, we investigate the benefit of incorporating supervision into the training process. Overall, we empirically validate that with little and imprecise supervision it is possible to reliably learn disentangled representations.
- Published
- 2019
40. Unsupervised Extraction of Phenotypes from Cancer Clinical Notes for Association Studies
- Author
-
Stark, Stefan G., Hyland, Stephanie L., Pradier, Melanie F., Lehmann, Kjong, Wicki, Andreas, Cruz, Fernando Perez, Vogt, Julia E., and Rätsch, Gunnar
- Subjects
Computer Science - Machine Learning ,Computer Science - Computation and Language ,Statistics - Applications ,Statistics - Machine Learning - Abstract
The recent adoption of Electronic Health Records (EHRs) by health care providers has introduced an important source of data that provides detailed and highly specific insights into patient phenotypes over large cohorts. These datasets, in combination with machine learning and statistical approaches, generate new opportunities for research and clinical care. However, many methods require the patient representations to be in structured formats, while the information in the EHR is often locked in unstructured texts designed for human readability. In this work, we develop the methodology to automatically extract clinical features from clinical narratives from large EHR corpora without the need for prior knowledge. We consider medical terms and sentences appearing in clinical narratives as atomic information units. We propose an efficient clustering strategy suitable for the analysis of large text corpora and to utilize the clusters to represent information about the patient compactly. To demonstrate the utility of our approach, we perform an association study of clinical features with somatic mutation profiles from 4,007 cancer patients and their tumors. We apply the proposed algorithm to a dataset consisting of about 65 thousand documents with a total of about 3.2 million sentences. We identify 341 significant statistical associations between the presence of somatic mutations and clinical features. We annotated these associations according to their novelty, and report several known associations. We also propose 32 testable hypotheses where the underlying biological mechanism does not appear to be known but plausible. These results illustrate that the automated discovery of clinical features is possible and the joint analysis of clinical and genetic datasets can generate appealing new hypotheses.
- Published
- 2019
41. Machine learning for early prediction of circulatory failure in the intensive care unit
- Author
-
Hyland, Stephanie L., Faltys, Martin, Hüser, Matthias, Lyu, Xinrui, Gumbsch, Thomas, Esteban, Cristóbal, Bock, Christian, Horn, Max, Moor, Michael, Rieck, Bastian, Zimmermann, Marc, Bodenham, Dean, Borgwardt, Karsten, Rätsch, Gunnar, and Merz, Tobias M.
- Subjects
Computer Science - Machine Learning ,Statistics - Applications ,Statistics - Machine Learning - Abstract
Intensive care clinicians are presented with large quantities of patient information and measurements from a multitude of monitoring systems. The limited human capacity to process such complex information makes it difficult for physicians to readily recognize and act on early signs of patient deterioration. We used machine learning to develop an early warning system for circulatory failure based on a high-resolution ICU database with 240 patient years of data. This automatic system predicts 90.0% of circulatory failure events (prevalence 3.1%), with 81.8% identified more than two hours in advance, resulting in an area under the receiver operating characteristic curve of 94.0% and area under the precision-recall curve of 63.0%. The model was externally validated in a large independent patient cohort., Comment: 5 main figures, 1 main table, 13 supplementary figures, 5 supplementary tables; 250ppi images
- Published
- 2019
42. Meta-Learning Mean Functions for Gaussian Processes
- Author
-
Fortuin, Vincent, Strathmann, Heiko, and Rätsch, Gunnar
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
When fitting Bayesian machine learning models on scarce data, the main challenge is to obtain suitable prior knowledge and encode it into the model. Recent advances in meta-learning offer powerful methods for extracting such prior knowledge from data acquired in related tasks. When it comes to meta-learning in Gaussian process models, approaches in this setting have mostly focused on learning the kernel function of the prior, but not on learning its mean function. In this work, we explore meta-learning the mean function of a Gaussian process prior. We present analytical and empirical evidence that mean function learning can be useful in the meta-learning setting, discuss the risk of overfitting, and draw connections to other meta-learning approaches, such as model agnostic meta-learning and functional PCA.
- Published
- 2019
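As a toy version of the mean-function meta-learning idea in the abstract above, one can fit a shared parametric mean on data pooled from related tasks and use it as the GP prior mean, so that predictions on a new scarce-data task revert to the learned mean rather than to zero away from the observations. The linear form is a deliberate simplification of the learned means discussed in the paper:

```python
def fit_shared_mean(tasks):
    """Meta-learn a shared linear mean m(x) = a*x + b by least squares over
    (x, y) pairs pooled from related tasks; m would then serve as the GP
    prior mean function for a new task."""
    xs = [x for task in tasks for x, _ in task]
    ys = [y for task in tasks for _, y in task]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b
```

With such a prior mean, the GP only needs to model task-specific deviations, which is exactly where a few observations go furthest.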
43. Genomic basis for RNA alterations in cancer.
- Author
-
PCAWG Transcriptome Core Group, Calabrese, Claudia, Davidson, Natalie R, Demircioğlu, Deniz, Fonseca, Nuno A, He, Yao, Kahles, André, Lehmann, Kjong-Van, Liu, Fenglin, Shiraishi, Yuichi, Soulette, Cameron M, Urban, Lara, Greger, Liliana, Li, Siliang, Liu, Dongbing, Perry, Marc D, Xiang, Qian, Zhang, Fan, Zhang, Junjun, Bailey, Peter, Erkek, Serap, Hoadley, Katherine A, Hou, Yong, Huska, Matthew R, Kilpinen, Helena, Korbel, Jan O, Marin, Maximillian G, Markowski, Julia, Nandi, Tannistha, Pan-Hammarström, Qiang, Pedamallu, Chandra Sekhar, Siebert, Reiner, Stark, Stefan G, Su, Hong, Tan, Patrick, Waszak, Sebastian M, Yung, Christina, Zhu, Shida, Awadalla, Philip, Creighton, Chad J, Meyerson, Matthew, Ouellette, BF Francis, Wu, Kui, Yang, Huanming, PCAWG Transcriptome Working Group, Brazma, Alvis, Brooks, Angela N, Göke, Jonathan, Rätsch, Gunnar, Schwarz, Roland F, Stegle, Oliver, Zhang, Zemin, and PCAWG Consortium
- Subjects
PCAWG Transcriptome Core Group ,PCAWG Transcriptome Working Group ,PCAWG Consortium ,Humans ,Neoplasms ,DNA ,Neoplasm ,RNA ,Genomics ,Gene Expression Regulation ,Neoplastic ,Genome ,Human ,DNA Copy Number Variations ,Transcriptome ,General Science & Technology - Abstract
Transcript alterations often result from somatic changes in cancer genomes1. Various forms of RNA alterations have been described in cancer, including overexpression2, altered splicing3 and gene fusions4; however, it is difficult to attribute these to underlying genomic changes owing to heterogeneity among patients and tumour types, and the relatively small cohorts of patients for whom samples have been analysed by both transcriptome and whole-genome sequencing. Here we present, to our knowledge, the most comprehensive catalogue of cancer-associated gene alterations to date, obtained by characterizing tumour transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA)5. Using matched whole-genome sequencing data, we associated several categories of RNA alterations with germline and somatic DNA alterations, and identified probable genetic mechanisms. Somatic copy-number alterations were the major drivers of variations in total gene and allele-specific expression. We identified 649 associations of somatic single-nucleotide variants with gene expression in cis, of which 68.4% involved associations with flanking non-coding regions of the gene. We found 1,900 splicing alterations associated with somatic mutations, including the formation of exons within introns in proximity to Alu elements. In addition, 82% of gene fusions were associated with structural variants, including 75 of a new class, termed 'bridged' fusions, in which a third genomic location bridges two genes. We observed transcriptomic alteration signatures that differ between cancer types and have associations with variations in DNA mutational signatures. This compendium of RNA alterations in the genomic context provides a rich resource for identifying genes and mechanisms that are functionally implicated in cancer.
- Published
- 2020
44. Improving Clinical Predictions through Unsupervised Time Series Representation Learning
- Author
-
Lyu, Xinrui, Hüser, Matthias, Hyland, Stephanie L., Zerveas, George, and Rätsch, Gunnar
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In this work, we investigate unsupervised representation learning on medical time series, which bears the promise of leveraging copious amounts of existing unlabeled data in order to eventually assist clinical decision making. By evaluating on the prediction of clinically relevant outcomes, we show that in a practical setting, unsupervised representation learning can offer clear performance benefits over end-to-end supervised architectures. We experiment with using sequence-to-sequence (Seq2Seq) models in two different ways, as an autoencoder and as a forecaster, and show that the best performance is achieved by a forecasting Seq2Seq model with an integrated attention mechanism, proposed here for the first time in the setting of unsupervised learning for medical time series., Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
- Published
- 2018
45. Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
- Author
-
Locatello, Francesco, Bauer, Stefan, Lucic, Mario, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard, and Bachem, Olivier
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train more than 12,000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, increased disentanglement does not seem to lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.
- Published
- 2018
46. Scalable Gaussian Processes on Discrete Domains
- Author
-
Fortuin, Vincent, Dresdner, Gideon, Strathmann, Heiko, and Rätsch, Gunnar
- Subjects
Statistics - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Kernel methods on discrete domains have shown great promise for many challenging data types, for instance, biological sequence data and molecular structure data. Scalable kernel methods like Support Vector Machines may offer good predictive performances but do not intrinsically provide uncertainty estimates. In contrast, probabilistic kernel methods like Gaussian Processes offer uncertainty estimates in addition to good predictive performance but fall short in terms of scalability. While the scalability of Gaussian processes can be improved using sparse inducing point approximations, the selection of these inducing points remains challenging. We explore different techniques for selecting inducing points on discrete domains, including greedy selection, determinantal point processes, and simulated annealing. We find that simulated annealing, which can select inducing points that are not in the training set, can perform competitively with support vector machines and full Gaussian processes on synthetic data, as well as on challenging real-world DNA sequence data., Comment: Published at IEEE Access
- Published
- 2018
- Full Text
- View/download PDF
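The simulated-annealing selection of inducing points explored in the abstract above can be sketched generically: propose single-point swaps and accept worse sets with probability exp(Δ/T) under a cooling schedule. The scoring function here stands in for a sparse-GP objective such as the ELBO, and the cooling schedule is a hypothetical choice:

```python
import math
import random

def anneal_inducing_points(candidates, score, m, steps=300, t0=1.0, seed=0):
    """Select m inducing points from `candidates` maximizing `score` by
    simulated annealing: swap one point at a time; worse sets are accepted
    with probability exp((s - cur) / T), with T cooling linearly."""
    rng = random.Random(seed)
    cur = rng.sample(candidates, m)
    cur_s = score(cur)
    best, best_s = cur[:], cur_s
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-12
        prop = cur[:]
        # may propose a duplicate; a production version would resample
        # until the set stays distinct
        prop[rng.randrange(m)] = rng.choice(candidates)
        s = score(prop)
        if s >= cur_s or rng.random() < math.exp((s - cur_s) / temp):
            cur, cur_s = prop, s
            if s > best_s:
                best, best_s = prop[:], s
    return best, best_s
```

Unlike greedy selection from the training set, proposals can in principle range over any candidate locations, which is the property the abstract highlights for simulated annealing.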
47. PipeIT2: A tumor-only somatic variant calling workflow for molecular diagnostic Ion Torrent sequencing data
- Author
-
Schnidrig, Desiree, Garofoli, Andrea, Benjak, Andrej, Rätsch, Gunnar, Rubin, Mark A., Piscuoglio, Salvatore, and Ng, Charlotte K.Y.
- Published
- 2023
- Full Text
- View/download PDF
48. SimReadUntil for Benchmarking Selective Sequencing Algorithms on ONT Devices
- Author
-
Mordig, Maximilian, primary, Rätsch, Gunnar, additional, and Kahles, André, additional
- Published
- 2024
- Full Text
- View/download PDF
49. SOM-VAE: Interpretable Discrete Representation Learning on Time Series
- Author
-
Fortuin, Vincent, Hüser, Matthias, Locatello, Francesco, Strathmann, Heiko, and Rätsch, Gunnar
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
High-dimensional time series are common in many domains. Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations. However, most representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time. To address this problem, we propose a new representation learning framework building on ideas from interpretable discrete dimensionality reduction and deep generative modeling. This framework allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. We introduce a new way to overcome the non-differentiability in discrete representation learning and present a gradient-based version of the traditional self-organizing map algorithm that is more performant than the original. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the representation space. This model uncovers the temporal transition structure, improves clustering performance even further and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model in terms of clustering performance and interpretability on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real-world medical time series application on the eICU data set. Our learned representations compare favorably with competitor methods and facilitate downstream tasks on the real-world data., Comment: Accepted for publication at the Seventh International Conference on Learning Representations (ICLR 2019)
- Published
- 2018
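The self-organizing-map step at the heart of the SOM-VAE abstract above can be sketched as pulling the best-matching codebook node and its grid neighbors toward an encoding. This is a minimal sketch of the classic SOM update only; the paper's full objective adds reconstruction, straight-through gradients, and Markov transition terms, and the learning rate here is a hypothetical value:

```python
def grid_neighbors(k, width, n_nodes):
    """Indices of the up/down/left/right neighbors of node k on a SOM grid."""
    r, c = divmod(k, width)
    height = n_nodes // width
    out = set()
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < height and 0 <= cc < width:
            out.add(rr * width + cc)
    return out

def som_step(codebook, width, z, lr=0.5):
    """One SOM-style update: find the best-matching node for encoding z and
    move it and its grid neighbors toward z."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    k = min(range(len(codebook)), key=lambda i: sqdist(codebook[i], z))
    for i in {k} | grid_neighbors(k, width, len(codebook)):
        codebook[i] = [c + lr * (x - c) for c, x in zip(codebook[i], z)]
    return k
```

Updating grid neighbors alongside the winner is what gives the learned discrete representation its smooth, interpretable 2-D layout.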
50. Boosting Black Box Variational Inference
- Author
-
Locatello, Francesco, Dresdner, Gideon, Khanna, Rajiv, Valera, Isabel, and Rätsch, Gunnar
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Approximating a probability density in a tractable manner is a central task in Bayesian statistics. Variational Inference (VI) is a popular technique that achieves tractability by choosing a relatively simple variational family. Borrowing ideas from the classic boosting framework, recent approaches attempt to boost VI by replacing the selection of a single density with a greedily constructed mixture of densities. In order to guarantee convergence, previous works impose stringent assumptions that require significant effort for practitioners. Specifically, they require a custom implementation of the greedy step (called the LMO) for every probabilistic model with respect to an unnatural variational family of truncated distributions. Our work fixes these issues with novel theoretical and algorithmic insights. On the theoretical side, we show that boosting VI satisfies a relaxed smoothness assumption which is sufficient for the convergence of the functional Frank-Wolfe (FW) algorithm. Furthermore, we rephrase the LMO problem and propose to maximize the Residual ELBO (RELBO) which replaces the standard ELBO optimization in VI. These theoretical enhancements allow for black box implementation of the boosting subroutine. Finally, we present a stopping criterion drawn from the duality gap in the classic FW analyses and exhaustive experiments to illustrate the usefulness of our theoretical and algorithmic contributions.
- Published
- 2018
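The greedy Frank-Wolfe construction underlying boosted VI, as described in the abstract above, can be sketched on a discretized grid. To keep every step closed-form, squared error stands in for the KL objective (a deliberate simplification): the LMO then reduces to picking the candidate density most aligned with the current residual.

```python
def fw_boost_mixture(target, candidates, iters=50):
    """Frank-Wolfe-style boosting of a mixture approximation q to `target`
    (both discretized on a grid): each iteration the LMO selects the
    candidate with the largest inner product with the residual, then q is
    blended in with step size 2/(t+2)."""
    q = [0.0] * len(target)
    for t in range(iters):
        resid = [p - qi for p, qi in zip(target, q)]  # -grad of 0.5*||q - p||^2
        best = max(candidates,
                   key=lambda h: sum(r * hv for r, hv in zip(resid, h)))
        g = 2.0 / (t + 2.0)
        q = [(1.0 - g) * qi + g * hv for qi, hv in zip(q, best)]
    return q
```

The paper's contribution is making the analogous greedy step black-box for real densities via the RELBO, rather than requiring a hand-crafted LMO per model.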