Integrating Text and Image Pre-training for Multi-modal Algorithmic Reasoning