SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities