Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP