MARFT: Multi-Agent Reinforcement Fine-Tuning