Learn arithmetic coding

I want to understand the connection between entropy and optimal coding, and how to do optimal coding for a long sequence.
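
As a starting point, here is a minimal sketch of both ideas on a toy example. It is not a production coder: it uses exact fractions instead of the finite-precision integer arithmetic a practical implementation would need, and it assumes a fixed model in which every symbol has nonzero probability. The function names (`entropy`, `cumulative_starts`, `encode`, `decode`) and the three-symbol model are made up for illustration.

```python
from fractions import Fraction
from math import ceil, log2

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(float(p) * log2(p) for p in probs.values() if p > 0)

def cumulative_starts(probs):
    """Map each symbol to the start of its slice of [0, 1)."""
    starts, total = {}, Fraction(0)
    for sym in sorted(probs):
        starts[sym] = total
        total += probs[sym]
    return starts

def encode(message, probs):
    """Return a bit string that identifies `message` under the model `probs`."""
    starts = cumulative_starts(probs)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        # Shrink the interval to the slice belonging to this symbol.
        low += width * starts[sym]
        width *= probs[sym]
    # After the loop, width == P(message).
    # Pick the smallest k with 2^-k <= width / 2, so that some k-bit binary
    # fraction m / 2^k (and its whole 2^-k cell) fits inside the interval.
    k = 1
    while Fraction(1, 2 ** k) > width / 2:
        k += 1
    m = ceil(low * 2 ** k)
    return format(m, "0{}b".format(k))

def decode(bits, length, probs):
    """Recover `length` symbols from the bit string produced by `encode`."""
    starts = cumulative_starts(probs)
    value = Fraction(int(bits, 2), 2 ** len(bits))
    out = []
    for _ in range(length):
        # The symbol whose slice contains `value` is the one with the
        # largest start that does not exceed it (assumes all probs > 0).
        sym = max((s for s in probs if starts[s] <= value),
                  key=lambda s: starts[s])
        out.append(sym)
        # Rescale so the chosen slice becomes [0, 1) again.
        value = (value - starts[sym]) / probs[sym]
    return "".join(out)

# Toy model (an assumption for this example, not taken from any source).
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
message = "aabc"
bits = encode(message, probs)

print(entropy(probs))                      # 1.5
print(bits, len(bits))                     # 0010110 7
print(decode(bits, len(message), probs))   # aabc
```

The final interval width equals the probability of the whole message, here 1/64, so the ideal cost is -log2(1/64) = 6 bits, which is exactly 4 symbols times the 1.5 bits of entropy. The sketch spends 7 bits. With this way of choosing the bits, the overhead is always less than 2 bits for the whole message, no matter how long it is. For a long sequence whose symbol frequencies match the model, the cost per symbol therefore approaches the entropy.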