ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding