Search

Your search for author "Pires, Bernardo Avila" returned 21 results.

Search Results

1. A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning

2. Offline Regularised Reinforcement Learning for Large Language Models Alignment

3. Understanding the performance gap between online and offline alignment algorithms

4. Human Alignment of Large Language Models through Online Preference Optimisation

5. Off-policy Distributional Q($\lambda$): Distributional RL without Importance Sampling

6. Generalized Preference Optimization: A Unified Approach to Offline Alignment

7. DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

8. Understanding plasticity in neural networks

9. Hierarchical Reinforcement Learning in Complex 3D Environments

10. Understanding Self-Predictive Learning for Reinforcement Learning

11. The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning

12. BYOL-Explore: Exploration by Bootstrapped Prediction

13. Neural Recursive Belief States in Multi-Agent Reinforcement Learning

14. Geometric Entropic Exploration

15. Bootstrap your own latent: A new approach to self-supervised Learning

16. Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning

17. World Discovery Models

18. Multiclass Classification Calibration Functions

19. Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models

20. Statistical Linear Estimation with Penalized Estimators: an Application to Reinforcement Learning
