Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
