ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning