Metadata Conditioning Accelerates Language Model Pre-training