VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM