Pix2seq: A Language Modeling Framework for Object Detection