Image Captioners Are Scalable Vision Learners Too