Importance Sampling for Fair Policy Selection
- Source :
- Grantee Submission. 2017.
- Publication Year :
- 2017
Abstract
- We consider the problem of off-policy policy selection in reinforcement learning: using historical data generated from running one policy to compare two or more policies. We show that approaches based on importance sampling can be "unfair": they can select the worse of two policies more often than not. We give two examples where the unfairness of importance sampling could be practically concerning. We then present sufficient conditions to theoretically guarantee fairness and a related notion of safety. Finally, we provide a practical importance sampling-based estimator to help mitigate one of the systematic sources of unfairness resulting from using importance sampling for policy selection. [This paper was published in "Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence" (33rd, Sydney, Australia, August 11-15, 2017).]
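The kind of unfairness the abstract describes can be illustrated with a small simulation. The sketch below is a hypothetical one-step example and is not taken from the paper: the action probabilities, rewards, sample size, and all function names are assumptions chosen for illustration. It uses the ordinary importance sampling estimator, which is unbiased for each policy here, yet the policy with the lower true value wins the comparison in most trials because the better policy's action rarely appears in a small dataset.

```python
import random

# Hypothetical one-step problem (illustrative values, not from the paper).
BEHAVIOR = {0: 0.9, 1: 0.1}  # probability that the data-collecting policy picks each action
REWARD = {0: 1.0, 1: 2.0}    # deterministic reward of each action


def is_estimate(data, target_action):
    """Ordinary importance sampling estimate for a policy that always picks target_action."""
    total = 0.0
    for action, reward in data:
        # Importance weight = target probability (1 or 0) / behavior probability.
        weight = (1.0 if action == target_action else 0.0) / BEHAVIOR[action]
        total += weight * reward
    return total / len(data)


def collect(n, rng):
    """Draw n (action, reward) pairs by running the behavior policy."""
    data = []
    for _ in range(n):
        action = 0 if rng.random() < BEHAVIOR[0] else 1
        data.append((action, REWARD[action]))
    return data


def worse_policy_selection_rate(n_samples=5, n_trials=20000, seed=0):
    """Fraction of trials in which the worse policy gets the higher estimate."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_trials):
        data = collect(n_samples, rng)
        # "Always 0" has true value 1.0; "always 1" has true value 2.0.
        if is_estimate(data, 0) > is_estimate(data, 1):
            wrong += 1
    return wrong / n_trials


if __name__ == "__main__":
    print(worse_policy_selection_rate())
```

In this toy setup the worse policy is selected exactly when action 1 never appears among the five samples, which happens with probability 0.9^5 ≈ 0.59, so the simulated rate should land near that value.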
Details
- Language :
- English
- Database :
- ERIC
- Journal :
- Grantee Submission
- Publication Type :
- Report
- Accession number :
- ED586042
- Document Type :
- Reports - Research; Speeches/Meeting Papers