Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
