NIPS-1999-policy-gradient-methods-for-reinforcement-learning-with-function-approximation-Paper.pdf