TY - JOUR
T1 - Repeated inverse reinforcement learning
AU - Amin, Kareem
AU - Jiang, Nan
AU - Singh, Satinder
N1 - Funding Information:
This work was supported in part by NSF grant IIS 1319365 (Singh & Jiang) and in part by a Rackham Predoctoral Fellowship from the University of Michigan (Jiang). Any opinions, findings, conclusions, or recommendations expressed here are those of the authors and do not necessarily reflect the views of the sponsors.
Publisher Copyright:
© 2017 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2017
Y1 - 2017
AB - We introduce a novel repeated Inverse Reinforcement Learning problem: the agent has to act on behalf of a human in a sequence of tasks and wishes to minimize the number of tasks that it surprises the human by acting suboptimally with respect to how the human would have acted. Each time the human is surprised, the agent is provided a demonstration of the desired behavior by the human. We formalize this problem, including how the sequence of tasks is chosen, in a few different ways and provide some foundational results.
UR - http://www.scopus.com/inward/record.url?scp=85046996869&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046996869&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85046996869
SN - 1049-5258
VL - 2017-December
SP - 1816
EP - 1825
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 31st Annual Conference on Neural Information Processing Systems, NIPS 2017
Y2 - 4 December 2017 through 9 December 2017
ER -