Group-Level Data Selection for Efficient Pretraining