Search

Your search for author "Amodei, Dario" returned 17 results.

Search Constraints

Author: "Amodei, Dario"
Topic: computer science - machine learning

Search Results

1. Discovering Language Model Behaviors with Model-Written Evaluations

2. In-context Learning and Induction Heads

3. Toy Models of Superposition

4. Language Models (Mostly) Know What They Know

5. Scaling Laws and Interpretability of Learning from Repeated Data

6. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

7. A General Language Assistant as a Laboratory for Alignment

8. Evaluating Large Language Models Trained on Code

9. Scaling Laws for Autoregressive Generative Modeling

10. Learning to summarize from human feedback

11. Scaling Laws for Neural Language Models

12. Fine-Tuning Language Models from Human Preferences

13. An Empirical Model of Large-Batch Training

14. Reward learning from human preferences and demonstrations in Atari

15. Supervising strong learners by amplifying weak experts

16. AI safety via debate

17. Deep reinforcement learning from human preferences
