Projections for Approximate Policy Iteration Algorithms

Riad Akrour, Joni Pajarinen, Gerhard Neumann, Jan Peters

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-reviewed

Abstract

Approximate policy iteration is a class of reinforcement learning (RL) algorithms in which the policy is encoded using a function approximator, and it has been especially prominent in RL with continuous action spaces. In this class of RL algorithms, ensuring an increase in the policy return during a policy update often requires constraining the change in the action distribution. Several approximations exist in the literature for solving this constrained policy update problem. In this paper, we propose to improve on such solutions by introducing a set of projections that transform the constrained problem into an unconstrained one, which is then solved by standard gradient descent. Using these projections, we empirically demonstrate that our approach can improve both the policy update solution and the control over exploration of existing approximate policy iteration algorithms.
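To illustrate the idea of replacing a constrained policy update with an unconstrained one, the sketch below shows a minimal, hypothetical example in NumPy. It assumes a diagonal Gaussian policy whose update changes only the mean while the covariance stays fixed; the function names and the simple rescaling rule are illustrative and are not the projection operators of the paper (which also handle covariance and entropy constraints). Under a fixed covariance, the KL divergence is quadratic in the mean shift, so an update that violates the KL bound can be mapped back onto the bound in closed form.

```python
import numpy as np

def kl_mean_shift(mu_new, mu_old, std_old):
    """KL divergence between two diagonal Gaussians that differ only
    in their mean (covariance held fixed at std_old**2)."""
    return 0.5 * np.sum(((mu_new - mu_old) / std_old) ** 2)

def project_mean(mu_new, mu_old, std_old, eps):
    """Illustrative projection onto the KL ball of radius eps around
    the old policy. If the proposed mean update violates the bound,
    rescale the mean shift so the bound holds with equality; since the
    KL is quadratic in the shift, the scaling factor is sqrt(eps / kl).
    (Hypothetical sketch; not the paper's actual projections.)"""
    kl = kl_mean_shift(mu_new, mu_old, std_old)
    if kl <= eps:
        return mu_new
    return mu_old + np.sqrt(eps / kl) * (mu_new - mu_old)

# Example: a large proposed update gets pulled back onto the KL bound.
mu_old = np.zeros(3)
std_old = np.ones(3)
mu_proposed = np.array([2.0, -1.0, 0.5])
mu_projected = project_mean(mu_proposed, mu_old, std_old, eps=0.1)
print(kl_mean_shift(mu_projected, mu_old, std_old))  # ~0.1
```

Because such a projection can be written as a differentiable map from unconstrained parameters to parameters that satisfy the constraint by construction, standard gradient descent can be applied directly to the projected parameters, which is the mechanism the abstract refers to.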
Original language: English
Title of host publication: 36th International Conference on Machine Learning, ICML 2019
Pages: 267-276
ISBN (Electronic): 9781510886988
Publication status: Published - Jun 2019
Publication type: A4 Article in a conference publication
Event: International Conference on Machine Learning - Long Beach, United States
Duration: 9 Jun 2019 - 15 Jun 2019

Conference

Conference: International Conference on Machine Learning
Country: United States
City: Long Beach
Period: 9/06/19 - 15/06/19

Publication forum classification

  • Publication forum level 2

