GPU-Accelerated Policy Optimization via Batch Automatic Differentiation of Gaussian Processes for Real-World Control

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

5 Citations (Scopus)
17 Downloads (Pure)

Abstract

The ability of Gaussian processes (GPs) to predict the behavior of dynamical systems as a more sample-efficient alternative to parametric models seems promising for real-world robotics research. However, the computational complexity of GPs has made policy search a highly time- and memory-consuming process that has not been able to scale to larger problems. In this work, we develop a policy optimization method that leverages fast predictive sampling methods to process batches of trajectories in every forward pass and computes gradient updates over policy parameters by automatic differentiation of Monte Carlo evaluations, all on GPU. We demonstrate the effectiveness of our approach in training policies on a set of reference-tracking control experiments with a heavy-duty machine. Benchmark results show a significant speedup over exact methods and showcase the scalability of our method to larger policy networks, longer horizons, and up to thousands of trajectories with a sublinear drop in speed.
Original language: English
Title of host publication: 2022 IEEE International Conference on Robotics and Automation, ICRA 2022
Publisher: IEEE
Pages: 10557–10563
Number of pages: 7
ISBN (Electronic): 978-1-7281-9681-7
ISBN (Print): 978-1-7281-9682-4
DOIs
Publication status: Published - 2022
Publication type: A4 Article in conference proceedings
Event: IEEE International Conference on Robotics and Automation - Philadelphia, United States
Duration: 23 May 2022 – 27 May 2022

Publication series

Name: IEEE International Conference on Robotics and Automation
ISSN (Print): 2152-4092
ISSN (Electronic): 2379-9544

Conference

Conference: IEEE International Conference on Robotics and Automation
Abbreviated title: ICRA
Country/Territory: United States
City: Philadelphia
Period: 23/05/22 – 27/05/22

Keywords

  • Automation, Hydraulic Machines, Machine Learning, Policy Optimization, Gaussian Processes, Neural Networks

Publication forum classification

  • Publication forum level 1
