Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts