Mimic In-Context Learning for Multimodal Tasks