TY - GEN
T1 - Exploiting domain knowledge in planning for uncertain robot systems modeled as POMDPs
AU - Candido, Salvatore
AU - Davidson, James
AU - Hutchinson, Seth
PY - 2010/8/26
Y1 - 2010/8/26
N2 - We propose a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as partially observable Markov decision processes (POMDPs). POMDP models, which are increasingly popular in the robotics literature, permit a planner to consider future uncertainty in both the application of actions and sensing of observations. With our approach, domain experts can inject specialized knowledge into the planning process by providing a set of local policies that are used as primitives by the planner. If the local policies are chosen appropriately, the planner can evaluate further into the future, even for large problems, which can lead to better overall policies at decreased computational cost. We use a structured approach to encode the provided domain knowledge into the value function approximation. We demonstrate our approach on a multi-robot fire-fighting problem, in which a team of robots cooperates to extinguish a spreading fire, modeled as a stochastic process. The state space for this problem is significantly larger than is typical in the POMDP literature, and the geometry of the problem allows for the application of an intuitive set of local policies, thus demonstrating the effectiveness of our approach.
AB - We propose a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as partially observable Markov decision processes (POMDPs). POMDP models, which are increasingly popular in the robotics literature, permit a planner to consider future uncertainty in both the application of actions and sensing of observations. With our approach, domain experts can inject specialized knowledge into the planning process by providing a set of local policies that are used as primitives by the planner. If the local policies are chosen appropriately, the planner can evaluate further into the future, even for large problems, which can lead to better overall policies at decreased computational cost. We use a structured approach to encode the provided domain knowledge into the value function approximation. We demonstrate our approach on a multi-robot fire-fighting problem, in which a team of robots cooperates to extinguish a spreading fire, modeled as a stochastic process. The state space for this problem is significantly larger than is typical in the POMDP literature, and the geometry of the problem allows for the application of an intuitive set of local policies, thus demonstrating the effectiveness of our approach.
UR - http://www.scopus.com/inward/record.url?scp=77955809740&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77955809740&partnerID=8YFLogxK
U2 - 10.1109/ROBOT.2010.5509494
DO - 10.1109/ROBOT.2010.5509494
M3 - Conference contribution
AN - SCOPUS:77955809740
SN - 9781424450381
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 3596
EP - 3603
BT - 2010 IEEE International Conference on Robotics and Automation, ICRA 2010
T2 - 2010 IEEE International Conference on Robotics and Automation, ICRA 2010
Y2 - 3 May 2010 through 7 May 2010
ER -