Research output per year, 2009 to 2019
2019
Off-policy evaluation and learning from logged bandit feedback: Error reduction via surrogate policy
Xie, Y., Liu, Q., Zhou, Y., Liu, B., Wang, Z. & Peng, J., Jan 1 2019. Research output: Contribution to conference › Paper
Maximum likelihood
Feedback
evaluation
learning
Recommender systems