Why is Adam loss-scale invariant?
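
To make the question concrete: multiplying the loss by a constant c > 0 multiplies every gradient by c, so Adam's first-moment estimate scales by c and its second-moment estimate by c². The update divides the former by the square root of the latter, so c cancels, exactly when epsilon is zero and approximately otherwise. Below is a minimal sketch of that cancellation, assuming a fixed scalar gradient sequence and `eps=0`; the function name `adam_steps` and the sample gradients are made up for illustration and are not taken from any library.

```python
import math

def adam_steps(grads, lr=1e-3, beta1=0.9, beta2=0.999, eps=0.0):
    """Return Adam's parameter updates for a sequence of scalar gradients.

    Hypothetical helper for illustration; eps defaults to 0 so the
    cancellation of the loss scale is exact up to floating-point rounding.
    """
    m = 0.0  # running first-moment (mean) estimate
    v = 0.0  # running second-moment (uncentered variance) estimate
    updates = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        updates.append(-lr * m_hat / (math.sqrt(v_hat) + eps))
    return updates

grads = [0.5, -1.2, 0.3, 2.0]   # illustrative gradients of some loss L
scale = 1000.0                  # gradients of the rescaled loss 1000 * L

base = adam_steps(grads)
scaled = adam_steps([scale * g for g in grads])

for b, s in zip(base, scaled):
    print(f"{b:+.6e}  {s:+.6e}")
```

The two printed columns should agree up to rounding. With a nonzero `eps`, the match is only approximate, and it degrades when the scaled gradients become comparable in magnitude to `eps`.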