UniViT: Unifying Image and Video Understanding in One Vision Encoder