MoMa: Modulating Mamba for Adapting Image Foundation Models to Video Recognition