Why are Visually-Grounded Language Models Bad at Image Classification?