Off-Policy Interval Estimation with Lipschitz Value Iteration

Publication
Advances in Neural Information Processing Systems