Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL