It's a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data