MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training