Can We Predict Performance of Large Models across Vision-Language Tasks?