Learn PPO

More precisely:

  • understand how the clipping works and why it is there
  • implement it on some environment

2022-02-06 21:13

Read the paper Proximal Policy Optimization Algorithms on the train. It does not include the actual math, which is in the Trust Region Policy Optimization paper. Would be nice to make sure the math checks out.
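
To have something concrete to check the math against, here is a minimal sketch of the clipped surrogate objective as I read it in the PPO paper: the mean over timesteps of min(r * A, clip(r, 1 - eps, 1 + eps) * A), where r is the probability ratio between the new and old policy and A is the advantage estimate. The function name, the argument names, and the use of plain lists of log-probabilities are my own choices for illustration, and eps=0.2 is only a commonly used default.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the old policy (hypothetical inputs, one entry per timestep).
    advantages: advantage estimates for the same timesteps.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s), computed from logs.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# Positive advantage, ratio above 1 + eps: the gain is capped at (1 + eps) * A.
capped = clipped_surrogate([math.log(1.5)], [0.0], [1.0])

# Negative advantage, ratio above 1 + eps: no clipping, the full penalty stays.
uncapped = clipped_surrogate([math.log(1.5)], [0.0], [-1.0])
```

The two calls at the end show the asymmetry: the min only removes the incentive to move the ratio further in the direction that improves the objective, and it never hides a change that makes the objective worse.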