Author: "Krueger, David" / Topic: statistics - machine learning - Searchworks@Jio Institute Digital Library Search Results

1. Integrating uncertainty quantification into randomized smoothing based robustness guarantees

Author: Däubener, Sina, Maag, Kira, Krueger, David, and Fischer, Asja
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Statistics - Machine Learning
Abstract: Deep neural networks have proven to be extremely powerful, however, they are also vulnerable to adversarial attacks which can cause hazardous incorrect predictions in safety-critical applications. Certified robustness via randomized smoothing gives a probabilistic guarantee that the smoothed classifier's predictions will not change within an $\ell_2$-ball around a given input. On the other hand (uncertainty) score-based rejection is a technique often applied in practice to defend models against adversarial attacks. In this work, we fuse these two approaches by integrating a classifier that abstains from predicting when uncertainty is high into the certified robustness framework. This allows us to derive two novel robustness guarantees for uncertainty aware classifiers, namely (i) the radius of an $\ell_2$-ball around the input in which the same label is predicted and uncertainty remains low and (ii) the $\ell_2$-radius of a ball in which the predictions will either not change or be uncertain. While the former provides robustness guarantees with respect to attacks aiming at increased uncertainty, the latter informs about the amount of input perturbation necessary to lead the uncertainty aware model into a wrong prediction. Notably, this is on CIFAR10 up to 20.93% larger than for models not allowing for uncertainty based rejection. We demonstrate, that the novel framework allows for a systematic robustness evaluation of different network architectures and uncertainty measures and to identify desired properties of uncertainty quantification techniques. Moreover, we show that leveraging uncertainty in a smoothed classifier helps out-of-distribution detection.
Published: 2024

2. The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret

Author: Fluri, Lukas, Lang, Leon, Abate, Alessandro, Forré, Patrick, Krueger, David, and Skalse, Joar
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: In reinforcement learning, specifying reward functions that capture the intended task can be very challenging. Reward learning aims to address this issue by learning the reward function. However, a learned reward model may have a low error on the training distribution, and yet subsequently produce a policy with large regret. We say that such a reward model has an error-regret mismatch. The main source of an error-regret mismatch is the distributional shift that commonly occurs during policy optimization. In this paper, we mathematically show that a sufficiently low expected test error of the reward model guarantees low worst-case regret, but that for any fixed expected test error, there exist realistic data distributions that allow for error-regret mismatch to occur. We then show that similar problems persist even when using policy regularization techniques, commonly employed in methods such as RLHF. Our theoretical results highlight the importance of developing new ways to measure the quality of learned reward models., Comment: 58 pages, 1 figure
Published: 2024

3. Domain Generalization for Robust Model-Based Offline Reinforcement Learning

Author: Clark, Alan, Siddiqui, Shoaib Ahmed, Kirk, Robert, Anwar, Usman, Chung, Stephen, and Krueger, David
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Existing offline reinforcement learning (RL) algorithms typically assume that training data is either: 1) generated by a known policy, or 2) of entirely unknown origin. We consider multi-demonstrator offline RL, a middle ground where we know which demonstrators generated each dataset, but make no assumptions about the underlying policies of the demonstrators. This is the most natural setting when collecting data from multiple human operators, yet remains unexplored. Since different demonstrators induce different data distributions, we show that this can be naturally framed as a domain generalization problem, with each demonstrator corresponding to a different domain. Specifically, we propose Domain-Invariant Model-based Offline RL (DIMORL), where we apply Risk Extrapolation (REx) (Krueger et al., 2020) to the process of learning dynamics and rewards models. Our results show that models trained with REx exhibit improved domain generalization performance when compared with the natural baseline of pooling all demonstrators' data. We observe that the resulting models frequently enable the learning of superior policies in the offline model-based RL setting, can improve the stability of the policy learning process, and potentially enable increased exploration., Comment: Accepted to the NeurIPS 2022 Workshops on Distribution Shifts and Offline Reinforcement Learning
Published: 2022

4. Defining and Characterizing Reward Hacking

Author: Skalse, Joar, Howe, Nikolaus H. R., Krasheninnikov, Dmitrii, and Krueger, David
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We provide the first formal definition of reward hacking, a phenomenon where optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor performance according to the true reward function, $\mathcal{R}$. We say that a proxy is unhackable if increasing the expected proxy return can never decrease the expected true return. Intuitively, it might be possible to create an unhackable proxy by leaving some terms out of the reward function (making it "narrower") or overlooking fine-grained distinctions between roughly equivalent outcomes, but we show this is usually not the case. A key insight is that the linearity of reward (in state-action visit counts) makes unhackability a very strong condition. In particular, for the set of all stochastic policies, two reward functions can only be unhackable if one of them is constant. We thus turn our attention to deterministic policies and finite sets of stochastic policies, where non-trivial unhackable pairs always exist, and establish necessary and sufficient conditions for the existence of simplifications, an important special case of unhackability. Our results reveal a tension between using reward functions to specify narrow tasks and aligning AI systems with human values.
Published: 2022

5. Active Reinforcement Learning: Observing Rewards at a Cost

Author: Krueger, David, Leike, Jan, Evans, Owain, and Salvatier, John
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Active reinforcement learning (ARL) is a variant on reinforcement learning where the agent does not observe the reward unless it chooses to pay a query cost c > 0. The central question of ARL is how to quantify the long-term value of reward information. Even in multi-armed bandits, computing the value of this information is intractable and we have to rely on heuristics. We propose and evaluate several heuristic approaches for ARL in multi-armed bandits and (tabular) Markov decision processes, and discuss and illustrate some challenging aspects of the ARL problem., Comment: Originally appeared at the NeurIPS 2016 "Future of Interactive Learning Machines (FILM)" workshop
Published: 2020

6. Hidden Incentives for Auto-Induced Distributional Shift

Author: Krueger, David, Maharaj, Tegan, and Leike, Jan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Decisions made by machine learning systems have increasing influence on the world, yet it is common for machine learning algorithms to assume that no such influence exists. An example is the use of the i.i.d. assumption in content recommendation. In fact, the (choice of) content displayed can change users' perceptions and preferences, or even drive them away, causing a shift in the distribution of users. We introduce the term auto-induced distributional shift (ADS) to describe the phenomenon of an algorithm causing a change in the distribution of its own inputs. Our goal is to ensure that machine learning systems do not leverage ADS to increase performance when doing so could be undesirable. We demonstrate that changes to the learning algorithm, such as the introduction of meta-learning, can cause hidden incentives for auto-induced distributional shift (HI-ADS) to be revealed. To address this issue, we introduce `unit tests' and a mitigation strategy for HI-ADS, as well as a toy environment for modelling real-world issues with HI-ADS in content recommendation, where we demonstrate that strong meta-learners achieve gains in performance via ADS. We show meta-learning and Q-learning both sometimes fail unit tests, but pass when using our mitigation strategy.
Published: 2020

7. Out-of-Distribution Generalization via Risk Extrapolation (REx)

Author: Krueger, David, Caballero, Ethan, Jacobsen, Joern-Henrik, Zhang, Amy, Binas, Jonathan, Zhang, Dinghuai, Priol, Remi Le, and Courville, Aaron
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Statistics - Machine Learning
Abstract: Distributional shift is one of the major obstacles when transferring machine learning prediction systems from the lab to the real world. To tackle this problem, we assume that variation across training domains is representative of the variation we might encounter at test time, but also that shifts at test time may be more extreme in magnitude. In particular, we show that reducing differences in risk across training domains can reduce a model's sensitivity to a wide range of extreme distributional shifts, including the challenging setting where the input contains both causal and anti-causal elements. We motivate this approach, Risk Extrapolation (REx), as a form of robust optimization over a perturbation set of extrapolated domains (MM-REx), and propose a penalty on the variance of training risks (V-REx) as a simpler variant. We prove that variants of REx can recover the causal mechanisms of the targets, while also providing some robustness to changes in the input distribution ("covariate shift"). By appropriately trading-off robustness to causally induced distributional shifts and covariate shift, REx is able to outperform alternative methods such as Invariant Risk Minimization in situations where these types of shift co-occur.
Published: 2020

8. Scalable agent alignment via reward modeling: a research direction

Author: Leike, Jan, Krueger, David, Everitt, Tom, Martic, Miljan, Maini, Vishal, and Legg, Shane
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Statistics - Machine Learning
Abstract: One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions. Designing such reward functions is difficult in part because the user only has an implicit understanding of the task objective. This gives rise to the agent alignment problem: how do we create agents that behave in accordance with the user's intentions? We outline a high-level research direction to solve the agent alignment problem centered around reward modeling: learning a reward function from interaction with the user and optimizing the learned reward function with reinforcement learning. We discuss the key challenges we expect to face when scaling reward modeling to complex and general domains, concrete approaches to mitigate these challenges, and ways to establish trust in the resulting agents.
Published: 2018

9. Uncertainty in Multitask Transfer Learning

Author: Lacoste, Alexandre, Oreshkin, Boris, Chung, Wonchang, Boquet, Thomas, Rostamzadeh, Negar, and Krueger, David
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Using variational Bayes neural networks, we develop an algorithm capable of accumulating knowledge into a prior from multiple different tasks. The result is a rich and meaningful prior capable of few-shot learning on new tasks. The posterior can go beyond the mean field approximation and yields good uncertainty on the performed experiments. Analysis on toy tasks shows that it can learn from significantly different tasks while finding similarities among them. Experiments of Mini-Imagenet yields the new state of the art with 74.5% accuracy on 5 shot learning. Finally, we provide experiments showing that other existing methods can fail to perform well in different benchmarks.
Published: 2018

10. Neural Autoregressive Flows

Author: Huang, Chin-Wei, Krueger, David, Lacoste, Alexandre, and Courville, Aaron
Subjects: Computer Science - Learning, Statistics - Machine Learning
Abstract: Normalizing flows and autoregressive models have been successfully combined to produce state-of-the-art results in density estimation, via Masked Autoregressive Flows (MAF), and to accelerate state-of-the-art WaveNet-based speech synthesis to 20x faster than real-time, via Inverse Autoregressive Flows (IAF). We unify and generalize these approaches, replacing the (conditionally) affine univariate transformations of MAF/IAF with a more general class of invertible univariate transformations expressed as monotonic neural networks. We demonstrate that the proposed neural autoregressive flows (NAF) are universal approximators for continuous probability distributions, and their greater expressivity allows them to better capture multimodal target distributions. Experimentally, NAF yields state-of-the-art performance on a suite of density estimation tasks and outperforms IAF in variational autoencoders trained on binarized MNIST., Comment: 16 pages, 10 figures, 3 tables
Published: 2018

11. Deep Prior

Author: Lacoste, Alexandre, Boquet, Thomas, Rostamzadeh, Negar, Oreshkin, Boris, Chung, Wonchang, and Krueger, David
Subjects: Statistics - Machine Learning, Computer Science - Learning
Abstract: The recent literature on deep learning offers new tools to learn a rich probability distribution over high dimensional data such as images or sounds. In this work we investigate the possibility of learning the prior distribution over neural network parameters using such tools. Our resulting variational Bayes algorithm generalizes well to new tasks, even when very few training examples are provided. Furthermore, this learned prior allows the model to extrapolate correctly far from a given task's training data on a meta-dataset of periodic signals., Comment: Workshop paper, Accepted at Bayesian Deep Learning workshop, NIPS 2017
Published: 2017

12. Bayesian Hypernetworks

Author: Krueger, David, Huang, Chin-Wei, Islam, Riashat, Turner, Ryan, Lacoste, Alexandre, and Courville, Aaron
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Learning
Abstract: We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork $\h$ is a neural network which learns to transform a simple noise distribution, $p(\vec\epsilon) = \N(\vec 0,\mat I)$, to a distribution $q(\pp) := q(h(\vec\epsilon))$ over the parameters $\pp$ of another neural network (the "primary network")\@. We train $q$ with variational inference, using an invertible $\h$ to enable efficient estimation of the variational lower bound on the posterior $p(\pp | \D)$ via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of~$q(\pp)$. In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection., Comment: David Krueger and Chin-Wei Huang contributed equally
Published: 2017

13. A Closer Look at Memorization in Deep Networks

Author: Arpit, Devansh, Jastrzębski, Stanisław, Ballas, Nicolas, Krueger, David, Bengio, Emmanuel, Kanwal, Maxinder S., Maharaj, Tegan, Fischer, Asja, Courville, Aaron, Bengio, Yoshua, and Lacoste-Julien, Simon
Subjects: Statistics - Machine Learning, Computer Science - Learning
Abstract: We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs. real data. We also demonstrate that for appropriately tuned explicit regularization (e.g., dropout) we can degrade DNN training performance on noise datasets without compromising generalization on real data. Our analysis suggests that the notions of effective capacity which are dataset independent are unlikely to explain the generalization performance of deep networks when trained with gradient based methods because training data itself plays an important role in determining the degree of memorization., Comment: Appears in Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Devansh Arpit, Stanis{\l}aw Jastrz\k{e}bski, Nicolas Ballas, and David Krueger contributed equally to this work
Published: 2017

14. Regularizing RNNs by Stabilizing Activations

Author: Krueger, David and Memisevic, Roland
Subjects: Computer Science - Neural and Evolutionary Computing, Computer Science - Computation and Language, Computer Science - Learning, Statistics - Machine Learning
Abstract: We stabilize the activations of Recurrent Neural Networks (RNNs) by penalizing the squared distance between successive hidden states' norms. This penalty term is an effective regularizer for RNNs including LSTMs and IRNNs, improving performance on character-level language modeling and phoneme recognition, and outperforming weight noise and dropout. We achieve competitive performance (18.6\% PER) on the TIMIT phoneme recognition task for RNNs evaluated without beam search or an RNN transducer. With this penalty term, IRNN can achieve similar performance to LSTM on language modeling, although adding the penalty term to the LSTM results in superior performance. Our penalty term also prevents the exponential growth of IRNN's activations outside of their training horizon, allowing them to generalize to much longer sequences.
Published: 2015

15. Zero-bias autoencoders and the benefits of co-adapting features

Author: Konda, Kishore, Memisevic, Roland, and Krueger, David
Subjects: Statistics - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning, Computer Science - Neural and Evolutionary Computing
Abstract: Regularized training of an autoencoder typically results in hidden unit biases that take on large negative values. We show that negative biases are a natural result of using a hidden layer whose responsibility is to both represent the input data and act as a selection mechanism that ensures sparsity of the representation. We then show that negative biases impede the learning of data distributions whose intrinsic dimensionality is high. We also propose a new activation function that decouples the two roles of the hidden layer and that allows us to learn representations on data with very high intrinsic dimensionality, where standard autoencoders typically fail. Since the decoupled activation function acts like an implicit regularizer, the model can be trained by minimizing the reconstruction error of training data, without requiring any additional regularization.
Published: 2014

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

15 results on '"Krueger, David"'

1. Integrating uncertainty quantification into randomized smoothing based robustness guarantees

2. The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret

3. Domain Generalization for Robust Model-Based Offline Reinforcement Learning

4. Defining and Characterizing Reward Hacking

5. Active Reinforcement Learning: Observing Rewards at a Cost

6. Hidden Incentives for Auto-Induced Distributional Shift

7. Out-of-Distribution Generalization via Risk Extrapolation (REx)

8. Scalable agent alignment via reward modeling: a research direction

9. Uncertainty in Multitask Transfer Learning

10. Neural Autoregressive Flows

11. Deep Prior

12. Bayesian Hypernetworks

13. A Closer Look at Memorization in Deep Networks

14. Regularizing RNNs by Stabilizing Activations

15. Zero-bias autoencoders and the benefits of co-adapting features

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Database

15 results on '"Krueger, David"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources