Variational multiscale reinforcement learning for discovering reduced order closure models of nonlinear spatiotemporal transport systems