Trust Region Reward Optimization and Proximal Inverse Reward Optimization Algorithm