When and How Does CLIP Enable Domain and Compositional Generalization?