Information-Theoretic Reward Decomposition for Generalizable RLHF