Online planning for decentralized stochastic control with partial history sharing

Kaiqing Zhang, Erik Miehling, Tamer Basar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


In decentralized stochastic control, standard approaches for sequential decision-making, such as dynamic programming, quickly become intractable due to the need to maintain a complex information state. Computational challenges are further compounded if agents do not possess complete model knowledge. In this paper, we take advantage of the fact that in many problems agents share some common information, or history, an information structure termed partial history sharing. Under this information structure, the policy search space is greatly reduced. We propose a provably convergent, online tree-search-based algorithm that does not require a closed-form model or explicit communication among agents. Interestingly, our algorithm can be viewed as a generalization of several existing heuristic solvers for decentralized partially observable Markov decision processes. To demonstrate the applicability of the model, we propose a novel collaborative intrusion response model, where multiple agents (defenders) possessing asymmetric information aim to collaboratively defend a computer network. Numerical results demonstrate the performance of our algorithm.

Original language: English (US)
Title of host publication: 2019 American Control Conference, ACC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 7
ISBN (Electronic): 9781538679265
State: Published - Jul 2019
Externally published: Yes
Event: 2019 American Control Conference, ACC 2019 - Philadelphia, United States
Duration: Jul 10 2019 to Jul 12 2019

Publication series

Name: Proceedings of the American Control Conference
ISSN (Print): 0743-1619


Conference: 2019 American Control Conference, ACC 2019
Country/Territory: United States

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

