Verifiable Reinforcement Learning via Policy Extraction