Generating Images with Multimodal Language Models