Mimir: Improving Video Diffusion Models for Precise Text Understanding