Differentiable Reward Optimization for LLM based TTS system