Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation