Abstract
We propose constrained Earth Mover’s Distance (CEMD) Imitation Q-learning
that combines the exploration power of Reinforcement Learning (RL) and the
sample efficiency of Imitation Learning (IL). Sample efficiency makes Imitation
Q-learning a suitable approach for robot learning. For Q-learning, immediate
rewards can be efficiently computed by a greedy variant of Earth Mover’s Distance
(EMD) between the observed state-action pairs and state-actions in stored expert
demonstrations. In CEMD, we constrain the otherwise non-stationary greedy EMD
reward by proposing a greedy EMD upper bound estimate and a generic Q-learning
lower bound. On PyBullet continuous control benchmarks, CEMD is more sample
efficient, achieves higher performance, and shows lower variance than competing methods.
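
To make the idea of a greedy EMD reward concrete, here is a minimal sketch and not the authors' implementation. It assumes Euclidean distance, a one-to-one matching, and that each observed state-action pair is matched in time order to the nearest expert pair that has not been used yet. The function name `greedy_emd_rewards` and the array layout are hypothetical choices made for illustration only.

```python
import numpy as np


def greedy_emd_rewards(agent_pairs, expert_pairs):
    """Illustrative greedy matching reward (assumed form, not the paper's exact method).

    Each row is a concatenated state-action vector. Agent pairs are processed
    in time order; each one is matched to the closest expert pair that is
    still unmatched, and the reward is the negative matching distance.
    """
    agent_pairs = np.asarray(agent_pairs, dtype=float)
    expert_pairs = np.asarray(expert_pairs, dtype=float)
    if len(agent_pairs) > len(expert_pairs):
        raise ValueError("this sketch needs at least as many expert pairs as agent pairs")

    available = np.ones(len(expert_pairs), dtype=bool)  # expert pairs not yet matched
    rewards = np.empty(len(agent_pairs))
    for t, pair in enumerate(agent_pairs):
        dists = np.linalg.norm(expert_pairs - pair, axis=1)
        dists[~available] = np.inf       # exclude expert pairs that are already used
        j = int(np.argmin(dists))        # greedy choice: nearest remaining expert pair
        available[j] = False
        rewards[t] = -dists[j]
    return rewards


# Toy data: three expert pairs on a line, three observed pairs.
expert = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
agent = [[0.9, 0.0], [1.0, 0.0], [0.0, 0.0]]
rewards = greedy_emd_rewards(agent, expert)
```

In this toy run the second observed pair coincides with an expert pair but still receives a reward of about -1, because the first observed pair already consumed that expert pair. Under the assumptions above, this is one way the reward for the same state-action pair can change with the episode history, which is the kind of non-stationarity the abstract says CEMD constrains.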
Original language | English |
---|---|
Status | Published - 2022 |
MEC publication type | No MEC type |
Event | 36th Conference on Neural Information Processing Systems - Deep Reinforcement Learning Workshop - New Orleans Ernest N. Morial Convention Center, New Orleans, United States. Duration: 28 Nov 2022 → 9 Dec 2022 https://nips.cc/Conferences/2022/ScheduleMultitrack?event=49989 |
Workshop
Workshop | 36th Conference on Neural Information Processing Systems - Deep Reinforcement Learning Workshop |
---|---|
Abbreviated title | NeurIPS 2022 Deep RL workshop |
Country/Territory | United States |
City | New Orleans |
Period | 28/11/22 → 9/12/22 |
Web address |