Behavior cloning

DeepMind ACME code: https://github.com/deepmind/acme/tree/master/acme/agents/jax/bc

Given expert trajectories, treat each (state, action) pair as a labeled example and learn a supervised mapping from states to actions.
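
A minimal sketch of that idea, not the Acme implementation linked above. Everything here is an assumption made for illustration: the toy 1-D environment, the hand-written expert, and the names expert_policy, collect_demonstrations, train_bc and cloned_policy are all hypothetical. The policy is a logistic-regression classifier fit by gradient descent on cross-entropy, using only the Python standard library.

```python
import math
import random


def expert_policy(state):
    """Hypothetical expert: push right (1) when left of the origin, else left (0)."""
    return 1 if state < 0.0 else 0


def collect_demonstrations(num_trajectories, horizon, rng):
    """Roll out the expert in a toy 1-D world and record (state, action) pairs."""
    pairs = []
    for _ in range(num_trajectories):
        state = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            action = expert_policy(state)
            pairs.append((state, action))
            # Assumed toy dynamics: a fixed step in the chosen direction plus noise.
            state += (0.1 if action == 1 else -0.1) + rng.gauss(0.0, 0.05)
    return pairs


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_bc(pairs, epochs=500, lr=1.0):
    """Fit P(action=1 | state) = sigmoid(w * state + b) by full-batch
    gradient descent on the cross-entropy loss."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for state, action in pairs:
            error = sigmoid(w * state + b) - action  # d(loss)/d(logit)
            grad_w += error * state
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def cloned_policy(state, w, b):
    """Act greedily under the learned classifier."""
    return 1 if w * state + b > 0.0 else 0


rng = random.Random(0)
demos = collect_demonstrations(num_trajectories=50, horizon=20, rng=rng)
w, b = train_bc(demos)
print(cloned_policy(-0.5, w, b), cloned_policy(0.5, w, b))
```

Note that train_bc never uses the order of the pairs or which trajectory they came from. The trajectories are flattened into an ordinary labeled dataset, which is what makes this plain supervised learning.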