Search

Your search for "Amodei, Dario" returned 13 results.

Search Constraints

Author: "Amodei, Dario"
Topic: artificial intelligence (cs.ai)

Search Results

1. Constitutional AI: Harmlessness from AI Feedback

2. Measuring Progress on Scalable Oversight for Large Language Models

3. Language Models (Mostly) Know What They Know

4. Scaling Laws and Interpretability of Learning from Repeated Data

5. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

6. Discovering Language Model Behaviors with Model-Written Evaluations

7. Learning to summarize from human feedback

8. Reward learning from human preferences and demonstrations in Atari

9. Supervising strong learners by amplifying weak experts

10. Variational Option Discovery Algorithms

11. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

12. Deep reinforcement learning from human preferences

13. Concrete Problems in AI Safety
