Training Transformers with 4-bit Integers