Learning to Think: Information-Theoretic Reinforcement Fine-Tuning for LLMs