Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning