LocCa: Visual Pretraining with Location-aware Captioners