Maximum Entropy Semi-Supervised Inverse Reinforcement Learning

Authors :: Julien Audiffren
Valko, M.
Lazaric, A.
Ghavamzadeh, M.
Centre de Mathématiques et de Leurs Applications (CMLA)
École normale supérieure - Cachan (ENS Cachan)-Centre National de la Recherche Scientifique (CNRS)
Sequential Learning (SEQUEL)
Inria Lille - Nord Europe
Institut National de Recherche en Informatique et en Automatique (Inria)-Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL)
Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)
ANR-14-CE24-0010,ExTra-Learn,Extraction et transfert de connaissances dans l'apprentissage par renforcement(2014)
European Project: 270327,EC:FP7:ICT,FP7-ICT-2009-6,COMPLACS(2011)
Source :: International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina, Scopus-Elsevier
Publication Year :: 2015
Publisher :: HAL CCSD, 2015.
Abstract: A popular approach to apprenticeship learning (AL) is to formulate it as an inverse reinforcement learning (IRL) problem. The MaxEnt-IRL algorithm successfully integrates the maximum entropy principle into IRL and, unlike its predecessors, resolves the ambiguity arising from the fact that a possibly large number of policies could match the expert's behavior. In this paper, we study an AL setting in which, in addition to the expert's trajectories, a number of unsupervised trajectories are available. We introduce MESSI, a novel algorithm that combines MaxEnt-IRL with principles from semi-supervised learning. In particular, MESSI integrates the unsupervised data into the MaxEnt-IRL framework using a pairwise penalty on trajectories. Empirical results on highway driving and grid-world problems indicate that MESSI is able to take advantage of the unsupervised trajectories and improve the performance of MaxEnt-IRL.

Subjects :: semi-supervised learning
reinforcement learning
ComputingMethodologies_PATTERNRECOGNITION
[STAT.ML]Statistics [stat]/Machine Learning [stat.ML]
[INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]
apprenticeship learning

Details

Language :: English
Database :: OpenAIRE
Journal :: International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina, Scopus-Elsevier
Accession number :: edsair.dedup.wf.001..84432c7a4a3b5879ff395320481d2c18
