Spatial Understanding from Videos: Structured Prompts Meet Simulation Data