Learning Vision and Language Concepts for Controllable Image Generation