Approximate exploitability: Learning a best response in large games
2020·arXiv