Robust and Diverse Multi-Agent Learning via Rational Policy Gradient