Search

Your search for "Legg, Shane" returned 30 results.

Search Constraints

You searched for: Author "Legg, Shane"; Topic "computer science - machine learning"

Search Results

1. Scaling Instructable Agents Across Many Simulated Worlds

2. The Hydra Effect: Emergent Self-repair in Language Model Computations

3. Randomized Positional Encodings Boost Length Generalization of Transformers

4. Beyond Bayes-optimality: meta-learning what you know you don't know

5. Neural Networks and the Chomsky Hierarchy

6. Your Policy Regularizer is Secretly an Adversary

7. Safe Deep RL in 3D Environments using Human Feedback

8. Model-Free Risk-Sensitive Reinforcement Learning

9. Shaking the foundations: delusions in sequence models for interaction and control

10. Causal Analysis of Agent Behavior for AI Safety

11. Agent Incentives: A Causal Perspective

12. Avoiding Tampering Incentives in Deep RL via Decoupled Approval

13. REALab: An Embedded Perspective on Tampering

14. Algorithms for Causal Reasoning in Probability Trees

15. Meta-trained agents implement Bayes-optimal agents

16. Avoiding Side Effects By Considering Future Tasks

17. Quantifying Differences in Reward Functions

18. The Incentives that Shape Behaviour

19. Learning Human Objectives by Evaluating Hypothetical Behavior

20. Meta-learning of Sequential Strategies

21. Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings

22. Soft-Bayes: Prod for Mixtures of Experts with Log-Loss

23. Scaling shared model governance via model splitting

24. Scalable agent alignment via reward modeling: a research direction

25. Reward learning from human preferences and demonstrations in Atari

26. Penalizing side effects using stepwise relative reachability

27. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

28. Noisy Networks for Exploration

29. Deep reinforcement learning from human preferences

30. Causal Analysis of Agent Behavior for AI Safety
