LaViDa: A Large Diffusion Model for Vision-Language Understanding