Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation