Position: The Most Expensive Part of an LLM *should* be its Training Data