Deep deterministic policy gradient

This note has no content.