A probabilistic interpretation of self-paced learning with applications to reinforcement learning

Authors :: Pascal Klink
Hany Abdulsamad
Boris Belousov
Carlo D'Eramo
Jan Peters
Joni Pajarinen
Technische Universität Darmstadt
Department of Electrical Engineering and Automation
Aalto University
Source :: Scopus-Elsevier, Aalto University
Abstract: Across machine learning, the use of curricula has shown strong empirical potential to improve learning from data by avoiding local optima of training objectives. For reinforcement learning (RL), curricula are especially interesting, as the underlying optimization has a strong tendency to get stuck in local optima due to the exploration-exploitation trade-off. Recently, a number of approaches for the automatic generation of curricula for RL have been shown to increase performance while requiring less expert knowledge than manually designed curricula. However, these approaches are seldom investigated from a theoretical perspective, preventing a deeper understanding of their mechanics. In this paper, we present an approach for automated curriculum generation in RL with a clear theoretical underpinning. More precisely, we formalize the well-known self-paced learning paradigm as inducing a distribution over training tasks, which trades off between task complexity and the objective to match a desired task distribution. Experiments show that training on this induced distribution helps to avoid poor local optima across RL algorithms in different tasks with uninformative rewards and challenging exploration requirements.
Funding Information: This project has received funding from the DFG project PA3179/1-1 (ROBOLEAP) and from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 640554 (SKILLS4ROBOTS). Calculations for this research were conducted on the Lichtenberg high-performance computer of the TU Darmstadt.
Publisher Copyright: © 2021 Pascal Klink, Hany Abdulsamad, Boris Belousov, Carlo D'Eramo, Jan Peters, Joni Pajarinen.

Subjects :: FOS: Computer and information sciences
Computer Science - Machine Learning
Reinforcement learning
Self-paced learning
Curriculum learning
Tempered inference
RL-as-inference
Machine Learning (cs.LG)

Details

Database :: OpenAIRE
Journal :: Scopus-Elsevier, Aalto University
Accession number :: edsair.doi.dedup.....cb9468b3c8d190578d7b0361482f7b20
