Achieving Cross Modal Generalization with Multimodal Unified Representation