TY - GEN
T1 - Explorative Probabilistic Planning with Unknown Target Locations
AU - Nawaz, Farhad
AU - Ornik, Melkior
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/12/14
Y1 - 2020/12/14
AB - Motion planning in an unknown environment demands synthesis of an optimal control policy that balances between exploration and exploitation. In this paper, we present the environment as a labeled graph where the labels of states are initially unknown, and consider a motion planning objective to fulfill a generalized reach-avoid specification given on these labels in minimum time. By describing the record of visited labels as an automaton, we translate our problem to a Canadian traveler problem on an adapted state space. We propose a strategy that enables the agent to perform its task by exploiting possible a priori knowledge about the labels and the environment and incrementally revealing the environment online. Namely, the agent plans, follows, and replans the optimal path by assigning edge weights that balance between exploration and exploitation, given the current knowledge of the environment. We illustrate our strategy on the setting of an agent operating on a two-dimensional grid environment.
UR - http://www.scopus.com/inward/record.url?scp=85099883915&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099883915&partnerID=8YFLogxK
U2 - 10.1109/CDC42340.2020.9304481
DO - 10.1109/CDC42340.2020.9304481
M3 - Conference contribution
AN - SCOPUS:85099883915
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 2732
EP - 2737
BT - 2020 59th IEEE Conference on Decision and Control, CDC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 59th IEEE Conference on Decision and Control, CDC 2020
Y2 - 14 December 2020 through 18 December 2020
ER -