SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding