MDN: Parallelizing Stepwise Momentum for Delta Linear Attention