MURKA: Multi-Reward Reinforcement Learning with Knowledge Alignment for Optimization Tasks