A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes

Chengchun Shi, Masatoshi Uehara, Jiawei Huang, Nan Jiang

Research output: Contribution to journal › Conference article › peer-review
