Your search for author "Korbak, Tomasz" returned 42 results

Search Results

1. Aligning language models with human preferences

2. Foundational Challenges in Assuring Alignment and Safety of Large Language Models

3. Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

4. Towards Understanding Sycophancy in Language Models

5. Compositional preference models for aligning LMs

6. The Reversal Curse: LLMs trained on 'A is B' fail to learn 'B is A'

7. Taken out of context: On measuring situational awareness in LLMs

8. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

9. Inverse Scaling: When Bigger Isn't Better

10. Training Language Models with Language Feedback at Scale

11. Improving Code Generation by Training with Natural Language Feedback

12. Models of symbol emergence in communication: a conceptual review and a guide for avoiding local minima

13. Pretraining Language Models with Human Preferences

14. Aligning Language Models with Preferences through f-divergence Minimization

15. On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting

16. RL with KL penalties is better viewed as Bayesian inference

17. A continuity of Markov blanket interpretations under the Free Energy Principle

18. Controlling Conditional Language Models without Catastrophic Forgetting

19. Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication

20. Energy-Based Models for Code Generation under Compilability Constraints

21. Measuring non-trivial compositionality in emergent communication

22. Developmentally motivated emergence of compositional communication via template transfer

23. Exploiting Unsupervised Pre-training and Automated Feature Engineering for Low-resource Hate Speech Detection in Polish

24. The Emergence of Action-grounded Compositional Communication

25. Fine-tuning Tree-LSTM for phrase-level sentiment classification on a Polish dependency treebank. Submission to PolEval task 2

26. Fine-Tuning Tree-LSTM for Phrase-Level Sentiment Classification on a Polish Dependency Treebank

29. Scaffolded Minds And The Evolution Of Content In Signaling Pathways

36. Unsupervised learning and the natural origins of content

39. Enough blanket metaphysics, time for data-driven heuristics

41. Dlaczego reprezentacje nie trzymają się modeli dynamicznych? [Why don't representations stick to dynamical models?]

42. A continuity of Markov blanket interpretations under the free-energy principle
