Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals

Navid Aghasadeghi, Timothy Bretl

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we consider the problem of inverse reinforcement learning for a particular class of continuous-time stochastic systems with continuous state and action spaces, under the assumption that both the cost function and the optimal control policy are parametric with known basis functions. Our goal is to produce a cost function for which a given policy, observed in experiment, is optimal. We proceed by enforcing a constraint on the relationship between input noise and input cost that produces a maximum entropy distribution over the space of all sample paths. We apply maximum likelihood estimation to approximate the parameters of this distribution (hence, of the cost function) given a finite set of sample paths. We iteratively improve our approximation by adding to this set the sample path that would be optimal given our current estimate of the cost function. Preliminary results in simulation provide empirical evidence that our algorithm converges.
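The maximum likelihood step described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes a finite set of candidate paths, each summarized by a hypothetical feature vector, and a cost that is linear in those features, so that the maximum entropy distribution assigns each path a probability proportional to exp(-cost). The paper's continuous-time, path-integral setting and the iterative addition of optimal paths are not modeled here. The names `path_probabilities` and `fit_cost_weights`, the step size, and the example features are all assumptions made for illustration.

```python
import math


def path_probabilities(theta, features):
    """Maximum entropy distribution over a finite path set: p_i proportional to exp(-theta . f_i)."""
    costs = [sum(t * f for t, f in zip(theta, feat)) for feat in features]
    c_min = min(costs)  # shift costs so the largest exponent is 0 (numerical stability)
    weights = [math.exp(-(c - c_min)) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]


def fit_cost_weights(features, demo_index, steps=500, lr=0.1):
    """Gradient ascent on the log-likelihood of the demonstrated path (assumed toy setup)."""
    dim = len(features[0])
    theta = [0.0] * dim
    demo = features[demo_index]
    for _ in range(steps):
        probs = path_probabilities(theta, features)
        # Expected feature vector under the current path distribution.
        expected = [sum(p * feat[k] for p, feat in zip(probs, features)) for k in range(dim)]
        # Gradient of log p(demo) with respect to theta is E_p[f] - f_demo.
        theta = [t + lr * (e - demo[k]) for k, (t, e) in enumerate(zip(theta, expected))]
    return theta


if __name__ == "__main__":
    # Three hypothetical paths with two features each; path 0 is the observed one.
    example_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights = fit_cost_weights(example_features, demo_index=0)
    print(weights, path_probabilities(weights, example_features))
```

With a fixed candidate set, the likelihood of a demonstrated path that is an extreme point of the feature set has no finite maximizer, so the fixed step count acts as an implicit stopping rule in this sketch.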

Original language: English (US)
Title of host publication: IROS'11 - 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
Subtitle of host publication: Celebrating 50 Years of Robotics
Pages: 1561-1566
Number of pages: 6
State: Published - 2011
Event: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems: Celebrating 50 Years of Robotics, IROS'11 - San Francisco, CA, United States
Duration: Sep 25 2011 to Sep 30 2011

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems

Other

Other: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems: Celebrating 50 Years of Robotics, IROS'11
Country/Territory: United States
City: San Francisco, CA
Period: 9/25/11 to 9/30/11

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
