Hyperparameter Transfer Enables Consistent Gains of Matrix-Preconditioned Optimizers Across Scales