Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data