TY - GEN
T1 - Rationally inattentive Markov decision processes over a finite horizon
AU - Shafieepoorfard, Ehsan
AU - Raginsky, Maxim
N1 - Funding Information:
Research supported in part by the NSF under CAREER award no. CCF-1254041 and by the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.
Publisher Copyright:
© 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - The framework of Rationally Inattentive Markov Decision Processes (RIMDPs) is an extension of Partially Observable Markov Decision Processes (POMDPs) to the case where the observation kernel that governs the information-gathering process is also selected by the decision maker. At each time, an observation kernel is chosen subject to a constraint on the Shannon conditional mutual information between the history of states and the current observation given the history of past observations. This set-up naturally arises in the context of networked control systems, artificial intelligence, and economic decision-making by boundedly rational agents. We show that, under certain structural assumptions on the information pattern and on the optimal policy, Bellman's Principle of Optimality can be used to derive a general dynamic programming recursion for this problem that reduces to solving a sequence of conditional rate-distortion problems.
AB - The framework of Rationally Inattentive Markov Decision Processes (RIMDPs) is an extension of Partially Observable Markov Decision Processes (POMDPs) to the case where the observation kernel that governs the information-gathering process is also selected by the decision maker. At each time, an observation kernel is chosen subject to a constraint on the Shannon conditional mutual information between the history of states and the current observation given the history of past observations. This set-up naturally arises in the context of networked control systems, artificial intelligence, and economic decision-making by boundedly rational agents. We show that, under certain structural assumptions on the information pattern and on the optimal policy, Bellman's Principle of Optimality can be used to derive a general dynamic programming recursion for this problem that reduces to solving a sequence of conditional rate-distortion problems.
UR - http://www.scopus.com/inward/record.url?scp=85050948091&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050948091&partnerID=8YFLogxK
U2 - 10.1109/ACSSC.2017.8335416
DO - 10.1109/ACSSC.2017.8335416
M3 - Conference contribution
AN - SCOPUS:85050948091
T3 - Conference Record of 51st Asilomar Conference on Signals, Systems and Computers, ACSSC 2017
SP - 621
EP - 627
BT - Conference Record of 51st Asilomar Conference on Signals, Systems and Computers, ACSSC 2017
A2 - Matthews, Michael B.
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 51st Asilomar Conference on Signals, Systems and Computers, ACSSC 2017
Y2 - 29 October 2017 through 1 November 2017
ER -