VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary