Optimizing Large Language Model Training Using FP4 Quantization