Multi-Token Prediction Needs Registers