VISTA: Triplet-Supervised Video Style Transfer with Diffusion Transformers